Abstract. We prove the existence of Cantor families of small amplitude, linearly stable,
quasi-periodic solutions of quasi-linear (also called strongly nonlinear) autonomous
Hamiltonian differentiable perturbations of the mKdV equation. The proof is based on a weak
version of the Birkhoff normal form algorithm and a nonlinear Nash-Moser iteration. The
analysis of the linearized operators at each step of the iteration is achieved by
pseudo-differential operator techniques and a linear KAM reducibility scheme.

1 Introduction and main result

In the paper [5] we proved the first existence result of quasi-periodic solutions for
autonomous quasi-linear PDEs (also called “strongly nonlinear” in [24]), in particular
of small amplitude quasi-periodic solutions of the KdV equation subject to a
Hamiltonian quasi-linear perturbation. The approach developed in [5] (see also [?])
is of wide applicability for quasi-linear PDEs in 1 space dimension. In this paper we
take the opportunity to explain the general strategy of [5] applied to a model which
is slightly simpler than KdV.

We consider the cubic, focusing or defocusing, mKdV equation 

$$ u_t + u_{xxx} + \varsigma\, \partial_x (u^3) + \mathcal{N}_4(x, u, u_x, u_{xx}, u_{xxx}) = 0, \qquad \varsigma = \pm 1, \tag{1.1} $$

under periodic boundary conditions $x \in \mathbb{T} := \mathbb{R}/2\pi\mathbb{Z}$, where

$$ \mathcal{N}_4(x, u, u_x, u_{xx}, u_{xxx}) := -\partial_x \big[ (\partial_u f)(x, u, u_x) - \partial_x \big( (\partial_{u_x} f)(x, u, u_x) \big) \big] \tag{1.2} $$

is the most general quasi-linear Hamiltonian (local) nonlinearity. Note that $\mathcal{N}_4$
contains as many derivatives as the linear vector field $\partial_{xxx}$. It is a quasi-linear
perturbation because $\mathcal{N}_4$ depends linearly on the highest derivative $u_{xxx}$ multiplied
by a coefficient which is a nonlinear function of the lower order derivatives $u, u_x, u_{xx}$.
The equation (1.1) is the Hamiltonian PDE

$$ u_t = X_H(u), \qquad X_H(u) := \partial_x \nabla H(u), \tag{1.3} $$

where $\nabla H$ denotes the $L^2(\mathbb{T}_x)$ gradient of the Hamiltonian

$$ H(u) = \frac12 \int_{\mathbb{T}} u_x^2 \, dx - \frac{\varsigma}{4} \int_{\mathbb{T}} u^4 \, dx + \int_{\mathbb{T}} f(x, u, u_x) \, dx \tag{1.4} $$

on the real phase space

$$ H^1_0(\mathbb{T}_x) := \Big\{ u(x) \in H^1(\mathbb{T}, \mathbb{R}) : \int_{\mathbb{T}} u(x) \, dx = 0 \Big\} \tag{1.5} $$

endowed with the non-degenerate symplectic form 

$$ \Omega(u, v) := \int_{\mathbb{T}} (\partial_x^{-1} u)\, v \, dx, \qquad \forall u, v \in H^1_0(\mathbb{T}_x), \tag{1.6} $$

where $\partial_x^{-1} u$ is the periodic primitive of $u$ with zero average. The phase space $H^1_0(\mathbb{T}_x)$
is invariant for the evolution of (1.1) because the integral $\int_{\mathbb{T}} u(x)\, dx$ is a prime
integral (the mass). For simplicity we fix its value to $\int_{\mathbb{T}} u(x)\, dx = 0$. We recall that
the Poisson bracket between two functions $F, G : H^1_0(\mathbb{T}_x) \to \mathbb{R}$ is defined as

$$ \{F, G\}(u) := \Omega(X_F(u), X_G(u)) = \int_{\mathbb{T}} \nabla F(u)\, \partial_x \nabla G(u) \, dx. \tag{1.7} $$

We assume that the “Hamiltonian density” $f$ is of class $C^q(\mathbb{T} \times \mathbb{R} \times \mathbb{R}; \mathbb{R})$ for
some $q$ large enough (otherwise, as is well known, we cannot expect the existence
of smooth invariant KAM tori). We also assume that $f$ vanishes of order five around
$u = u_x = 0$, namely

$$ |f(x, u, v)| \le C (|u| + |v|)^5 \qquad \forall (u, v) \in \mathbb{R}^2, \ |u| + |v| \le 1. \tag{1.8} $$

As a consequence the nonlinearity $\mathcal{N}_4$ vanishes of order 4 at $u = 0$ and (1.1) may be
seen, close to the origin, as a “small” perturbation of the cubic mKdV equation

$$ u_t + u_{xxx} + \varsigma\, \partial_x (u^3) = 0. \tag{1.9} $$

This equation is known to be completely integrable. Actually it is mapped into
KdV by a Miura transform, and it may be described by global analytic action-angle
variables, as was proved by Kappeler-Topalov [20]. We also remark that, among
the generalized KdV equations $u_t + u_{xxx} \pm \partial_x (u^p) = 0$, $p \in \mathbb{N}$, the only known
completely integrable ones are the KdV $p = 2$ and the cubic mKdV $p = 3$.

It is a natural question to know whether the periodic, quasi-periodic or almost
periodic solutions of (1.9) persist under small perturbations. This is the content of
KAM theory. It is a difficult problem because of small divisors resonance phenomena,
which are especially strong in the presence of quasi-linear perturbations like $\mathcal{N}_4$.

In this paper (as well as in [5]) we restrict the analysis to the search of small
amplitude solutions. It is also a very interesting question to investigate possible
extensions of this result to perturbations of finite gap solutions. A difficulty which
arises in the search of small amplitude solutions is that the mKdV equation (1.9) is
a completely resonant PDE at $u = 0$, namely the linearized equation at the origin is
the linear Airy equation

$$ u_t + u_{xxx} = 0 $$

which possesses only the $2\pi$-periodic in time, real solutions

$$ u(t, x) = \sum_{j \in \mathbb{Z} \setminus \{0\}} u_j e^{\mathrm{i} j^3 t} e^{\mathrm{i} j x}, \qquad u_{-j} = \bar{u}_j. \tag{1.10} $$

Thus the existence of small amplitude quasi-periodic solutions of (1.1) is entirely
due to the nonlinearity. Indeed, the nonlinear term $\varsigma\, \partial_x (u^3)$ is the one that
produces the main modulation of the frequency vector of the solution with respect to
its amplitude (the well-known frequency-to-action map, or frequency-amplitude
relation, or “twist”, see (4.10)) and that allows one to “tune” the action parameters $\xi$ so
that the frequencies become rationally independent and diophantine. Note that
the mKdV equation (1.1) does not depend on other external parameters which may
influence the frequencies. This is a further difficulty in the study of autonomous
PDEs with respect to the forced cases studied in [3]. Actually, in [3] we considered
non-autonomous quasi-linear (and fully nonlinear) perturbations of the Airy
equation and we used the forcing frequencies as independent parameters.
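
The complete resonance of the linear Airy equation can be checked by hand on each Fourier mode. The following is a minimal sketch in plain Python; the helper names `airy_symbol` and `mode` are ours, chosen for illustration, and do not come from the paper. It verifies that $\partial_t + \partial_{xxx}$ annihilates every mode $e^{\mathrm{i}(j^3 t + j x)}$ and that each such mode is $2\pi$-periodic in time, which is why all linear frequencies are integers.

```python
import cmath
import math


def airy_symbol(j: int) -> complex:
    """Symbol of (d_t + d_xxx) on the mode exp(i*(j**3 * t + j * x))."""
    time_part = 1j * j**3        # d_t multiplies the mode by i*j^3
    space_part = (1j * j) ** 3   # d_xxx multiplies it by (i*j)^3 = -i*j^3
    return time_part + space_part


def mode(j: int, t: float, x: float) -> complex:
    """Single Fourier mode of the linear Airy flow."""
    return cmath.exp(1j * (j**3 * t + j * x))


modes = [j for j in range(-8, 9) if j != 0]

# Largest symbol over the tested modes: zero means each mode solves u_t + u_xxx = 0.
max_symbol = max(abs(airy_symbol(j)) for j in modes)

# Largest change of a mode after a time shift of 2*pi (zero up to rounding).
max_period_defect = max(
    abs(mode(j, 0.37 + 2 * math.pi, 1.1) - mode(j, 0.37, 1.1)) for j in modes
)

print(max_symbol, max_period_defect)
```

Both printed quantities should vanish up to rounding errors; the sample point $(t, x) = (0.37, 1.1)$ is arbitrary.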

The core of the matter is to understand the perturbative effect of the quasi-linear
term $\mathcal{N}_4$ over infinite times. By (1.8), close to the origin, the quartic term $\mathcal{N}_4$ is
smaller than the pure cubic mKdV (1.9). Therefore, when we restrict the equation
to finitely many space-Fourier indices $|j| \le C$, we essentially enter the range of
applicability of finite dimensional KAM theory close to an elliptic equilibrium. The
new problem is to understand what happens to the dynamics on the high frequencies
$|j| \to +\infty$, since $\mathcal{N}_4$ is a nonlinear differential operator of the same order (i.e. 3) as
the constant coefficient linear (and integrable) vector field $\partial_{xxx}$.

Does such a strongly nonlinear perturbation give rise to the formation of singularities
for a solution in finite time, as it happens for the quasi-linear wave equations
considered by Lax [?] and Klainerman-Majda [?]? Or, on the contrary, does the
KAM phenomenon persist nevertheless for the mKdV equation (1.1)? The answer
to these questions has been controversial for several years. For example,
Kappeler-Pöschel [19] (Remark 3, page 19) wrote: “It would be interesting to obtain
perturbation results which also include terms of higher order, at least in the region where
the KdV approximation is valid. However, results of this type are still out of reach,
if true at all”.

We think that these are very important dynamical questions to be investigated, 
especially because many of the equations arising in Physics are quasi-linear or even 
fully nonlinear. 

The main result of this paper proves that the KAM phenomenon actually persists,
at least close to the origin, for quasi-linear Hamiltonian perturbations of mKdV
(the same result is proved in [5] for KdV). More precisely, Theorem 1.1 proves the
existence of Cantor families of small amplitude, linearly stable, quasi-periodic
solutions of the mKdV equation (1.1) subject to quasi-linear Hamiltonian perturbations.
It is not surprising that the same result applies for both the focusing and the
defocusing mKdV because we are looking for small amplitude solutions. Thus the
different sign $\varsigma = \pm 1$ only affects the branch of the bifurcation.

From a dynamical point of view, note that the parameters $\xi$ selected by the
KAM Theorem 1.1 give rise to solutions of (1.1)-(1.2) which are global in time.
This is interesting information because, as far as we know, there are no results of
global or even local solutions of the Cauchy problem for (1.1)-(1.2), and such PDEs
are in general believed to be ill-posed in Sobolev spaces (for a rough result of local
well-posedness for (1.1)-(1.2) see [?]).

The iterative procedure we are going to present is able to select many parameters
$\xi$ which give rise to quasi-periodic solutions (hence defined for all times). This
procedure works for parameters belonging to a finite dimensional Cantor like set
which becomes asymptotically dense at the origin.

How can this kind of result be achieved? The proof of Theorem 1.1, which we
shall discuss in more detail later, is based on an iterative Nash-Moser scheme. As
is well known, the main step of this procedure is to invert the linearized operators
obtained at each step of the iteration and to prove that the inverse operators,
although they lose derivatives (because of small divisors), satisfy tame estimates in
high Sobolev norms. The linearized equations are non-autonomous linear PDEs
which depend quasi-periodically on time. The key point of this paper (and [5]) is
that, using the symplectic decoupling of [10], some techniques of pseudo-differential
operators adapted to the symplectic structure, and a linear Birkhoff normal form
analysis, we are able to construct, for most diophantine frequencies, a time dependent
(quasi-periodic) change of variables which conjugates each linearized equation
into another one that is diagonal and has constant coefficients, that is, in “normal
form”. This means that, in the new coordinates, we have integrated the equations.
Then we easily invert the linearized operator (recall that the inverse loses derivatives
because of small divisors) and we conjugate it back to solve the linear equation in
the original set of variables. We remark that these quasi-periodic Floquet changes
of variables map Sobolev spaces of arbitrarily high norms into themselves and satisfy tame
estimates. Hence the inverse operator also loses derivatives, but it satisfies tame
estimates as well.

In the dynamical systems literature, this strategy is called “reducibility” of the 
equation and it is a quasi-periodic KAM perturbative extension of Floquet theory 
(Floquet theory deals with periodic solutions of finite dimensional systems). The
difficulty in making it work in the present setting is due to the quasi-linear character
of the nonlinearity in (1.1).

Before stating precisely our main result we briefly present some related literature.
In recent years considerable interest has been devoted to understanding the effect of
derivatives in the nonlinearity in KAM theory. For unbounded perturbations the
first KAM results have been proved by Kuksin [23] and Kappeler-Pöschel [19] for
KdV (see also Bourgain [12]), and more recently by Liu-Yuan [18], Zhang-Gao-Yuan
[2?] for derivative NLS, and by Berti-Biasco-Procesi [?]-[?] for derivative NLW. For
a recent survey of known results for KdV, we refer to [?]. Actually all these results
still concern semi-linear perturbations.

The KAM theorems in [23], [19] prove the persistence of the finite-gap solutions
of the integrable KdV under semilinear Hamiltonian perturbations $\varepsilon \partial_x (\partial_u f)(x, u)$,
namely when the density $f$ is independent of $u_x$, so that (1.2) is a differential operator
of order 1. The key idea in [23] is to exploit the fact that the frequencies of KdV
grow as $\sim j^3$ and the difference $|j^3 - i^3| \ge \frac12 (j^2 + i^2)$, $i \ne j$, so that KdV gains
(outside the diagonal) two derivatives. This approach also works for Hamiltonian
pseudo-differential perturbations of order 2 (in space), using the improved Kuksin's
lemma proved by Liu-Yuan in [18]. However it does not work for the general
quasi-linear perturbation in (1.2), which is a nonlinear differential operator of the same
order as the constant coefficient linear operator $\partial_{xxx}$.
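
The gap estimate $|j^3 - i^3| \ge \frac12 (j^2 + i^2)$ for $i \ne j$ follows from the factorization $j^3 - i^3 = (j - i)(j^2 + ij + i^2)$, together with $|j - i| \ge 1$ and $j^2 + ij + i^2 \ge \frac12 (j^2 + i^2)$ (which is just $(i + j)^2 \ge 0$). As a hedged illustration only, the sketch below checks both facts by brute force on a finite window of integers; the function names and the window size are our own choices.

```python
def cubic_gap_holds(i: int, j: int) -> bool:
    """Check |j^3 - i^3| >= (j^2 + i^2) / 2 in exact integer arithmetic."""
    return 2 * abs(j**3 - i**3) >= j**2 + i**2


def factorised_difference(i: int, j: int) -> int:
    """(j - i) * (j^2 + i*j + i^2), which should equal j^3 - i^3."""
    return (j - i) * (j**2 + i * j + i**2)


indices = range(-60, 61)

# Pairs i != j violating the gap estimate (expected to be none).
violations = [(i, j) for i in indices for j in indices
              if i != j and not cubic_gap_holds(i, j)]

# The algebraic identity behind the estimate.
factorisation_ok = all(factorised_difference(i, j) == j**3 - i**3
                       for i in indices for j in indices)

print(violations, factorisation_ok)
```

A finite search is of course not a proof; it only confirms the elementary argument on the tested range.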

Now we state precisely the main result of the paper. The solutions we find are, at
the first order of amplitude, localized in Fourier space on finitely many “tangential
sites”

$$ S^+ := \{ \bar\jmath_1, \ldots, \bar\jmath_\nu \}, \qquad S := \{ \pm j : j \in S^+ \}, \qquad \bar\jmath_i \in \mathbb{N} \setminus \{0\} \quad \forall i = 1, \ldots, \nu. \tag{1.11} $$

The set $S$ is required to be even because the solutions $u$ of (1.1) have to be real
valued. Moreover, we also assume the following explicit “non-degeneracy” hypothesis
on $S$:

$$ \frac{2}{2\nu - 1} \sum_{i=1}^{\nu} \bar\jmath_i^2 \notin \big\{ j^2 + kj + k^2 : j, k \in \mathbb{Z} \setminus S, \ j \ne k \big\}. \tag{1.12} $$
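
For a concrete choice of tangential sites, the non-degeneracy hypothesis can be tested by a finite search. The sketch below is only an illustration under an explicit assumption: it reads (1.12) as the requirement that $q := \frac{2}{2\nu - 1} \sum_i \bar\jmath_i^2$ is not of the form $j^2 + kj + k^2$ with $j, k \in \mathbb{Z} \setminus S$, $j \ne k$. The search is finite because $j^2 + kj + k^2 \ge \frac34 \max(j^2, k^2)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def representations(n: int, excluded: set) -> list:
    """Pairs (j, k), j != k, j and k outside `excluded`, with j^2 + j*k + k^2 == n."""
    if n < 0:
        return []
    # j^2 + j*k + k^2 >= (3/4) * max(j, k)^2, so |j|, |k| <= sqrt(4n/3).
    bound = isqrt(4 * n // 3) + 1
    return [(j, k)
            for j in range(-bound, bound + 1)
            for k in range(-bound, bound + 1)
            if j != k and j not in excluded and k not in excluded
            and j * j + j * k + k * k == n]


def satisfies_nondegeneracy(positive_sites: list) -> bool:
    """Test the hypothesis for S = {+j, -j : j in positive_sites}."""
    nu = len(positive_sites)
    sites = {s for j in positive_sites for s in (j, -j)}
    q = Fraction(2 * sum(j * j for j in positive_sites), 2 * nu - 1)
    if q.denominator != 1:
        return True  # a non-integer cannot equal j^2 + k*j + k^2
    return not representations(q.numerator, sites)


for candidate in ([1], [1, 2], [3, 6], [3, 21]):
    print(candidate, satisfies_nondegeneracy(candidate))
```

In this sketch the choice $S^+ = \{3, 21\}$ fails the test, since $q = 300 = 20^2 + 20 \cdot (-10) + (-10)^2$ with $20, -10 \notin S$, while $S^+ = \{3, 6\}$ passes even though $q = 30$ is an integer.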

Theorem 1.1 (KAM for quasi-linear perturbations of mKdV). Given $\nu \in \mathbb{N}$, let
$f \in C^q$ (with $q := q(\nu)$ large enough) satisfy (1.8). Then, for all the tangential sites
$S$ as in (1.11) satisfying (1.12), the mKdV equation (1.1) possesses small amplitude
quasi-periodic solutions with diophantine frequency vector $\omega := \omega(\xi) = (\omega_j)_{j \in S^+} \in \mathbb{R}^\nu$
of the form

$$ u(t, x) = \sum_{j \in S^+} 2 \sqrt{\xi_j} \cos(\omega_j t + j x) + o(\sqrt{|\xi|}), \tag{1.13} $$

where

$$ \omega_j := j^3 + 3 \varsigma \Big[ \xi_j - 2 \Big( \sum_{j' \in S^+} \xi_{j'} \Big) \Big] j, \qquad j \in S^+, \tag{1.14} $$

for a “Cantor-like” set of small amplitudes $\xi \in \mathbb{R}^\nu_+$ with density 1 at $\xi = 0$. The
term $o(\sqrt{|\xi|})$ in (1.13) is a function $u_1(t, x) = \tilde u_1(\omega t, x)$, with $\tilde u_1$ in the Sobolev space
$H^s(\mathbb{T}^{\nu+1}, \mathbb{R})$ of periodic functions, and Sobolev norm $\|\tilde u_1\|_s = o(\sqrt{|\xi|})$ as $\xi \to 0$, for
some $s < q$. These quasi-periodic solutions are linearly stable.

If the density $f(u, u_x)$ is independent of $x$, a similar result holds for all the
choices of the tangential sites, without assuming (1.12).
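
To make the frequency-amplitude relation of Theorem 1.1 concrete, here is a minimal numerical sketch, assuming the first-order formula $\omega_j = j^3 + 3\varsigma [\xi_j - 2 \sum_{j'} \xi_{j'}] j$. The function names `frequency_map` and `leading_profile` are hypothetical, and the sample amplitudes are arbitrary: the theorem only applies to $\xi$ in a Cantor-like set, which this sketch does not attempt to describe. For $\nu = 1$ the formula reduces to $\omega = j^3 - 3\varsigma \xi j$.

```python
import math


def frequency_map(positive_sites, xi, sigma):
    """First-order frequencies: omega_j = j^3 + 3*sigma*(xi_j - 2*sum(xi))*j."""
    if sigma not in (1, -1):
        raise ValueError("sigma must be +1 or -1")
    total = sum(xi)
    return [j**3 + 3 * sigma * (x - 2 * total) * j
            for j, x in zip(positive_sites, xi)]


def leading_profile(positive_sites, xi, sigma, t, x):
    """Leading term of the solution: sum over S^+ of 2*sqrt(xi_j)*cos(omega_j*t + j*x)."""
    omega = frequency_map(positive_sites, xi, sigma)
    return sum(2 * math.sqrt(a) * math.cos(w * t + j * x)
               for j, a, w in zip(positive_sites, xi, omega))


sites = [1, 2]                 # hypothetical tangential sites S^+
amplitudes = [0.01, 0.02]      # hypothetical small amplitudes xi

print(frequency_map(sites, amplitudes, sigma=1))
print(frequency_map(sites, amplitudes, sigma=-1))
print(leading_profile(sites, amplitudes, 1, t=0.0, x=0.0))
```

At $\xi = 0$ the map returns the integer linear frequencies $j^3$, and the sign $\varsigma$ only decides on which side of $j^3$ the frequencies bifurcate.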


This result is deduced from Theorem 5.1. It was announced also in [?]-[?] under
the stronger condition on the tangential sites

$$ \frac{2}{2\nu - 1} \sum_{i=1}^{\nu} \bar\jmath_i^2 \notin \mathbb{Z}. \tag{1.15} $$


Let us make some comments. 

1. In the case $\nu = 1$ (time-periodic solutions), the condition (1.12) is always
satisfied. Indeed, suppose, by contradiction, that there exist integers $\bar\jmath_1 \ge 1$,
$j, k \in \mathbb{Z}$ such that

$$ 2 \bar\jmath_1^2 = j^2 + jk + k^2. \tag{1.16} $$

Then $j^2 + jk + k^2$ is even, and therefore both $j$ and $k$ are even, say $j = 2n$,
$k = 2m$ with $n, m \in \mathbb{Z}$. Hence $2 \bar\jmath_1^2 = 4(n^2 + nm + m^2)$, and this implies that $\bar\jmath_1$ is
even, say $\bar\jmath_1 = 2p$ for some positive integer $p$. It follows that $2p^2 = n^2 + nm + m^2$,
namely $p, n, m$ satisfy (1.16). Then, iterating the argument, we deduce that
$\bar\jmath_1$ can be divided by 2 infinitely many times in $\mathbb{N}$, which is impossible.

2. When the density $f(u, u_x)$ is independent of $x$, the $L^2$-norm

$$ M(u) := \int_{\mathbb{T}} u^2 \, dx = \|u\|^2_{L^2(\mathbb{T})} \tag{1.17} $$

is a prime integral of the Hamiltonian equation (1.1). Hence the solutions of
(1.1) are in one-to-one correspondence with those of the Hamiltonian equation

$$ v_t = \partial_x \nabla K(v) \quad \text{with} \quad K := H + \lambda M^2, \quad \lambda \in \mathbb{R}. \tag{1.18} $$

More precisely, if $u(t, x)$ is a solution of (1.1), then $v(t, x) := u(t, x - ct)$, with
$c := -4\lambda M(u)$, is a solution of (1.18). Vice versa, if $v(t, x)$ solves (1.18), then
the function $u(t, x) := v(t, x + ct)$, with $c := -4\lambda M(v)$, is a solution of (1.1)
($M(v)$ is also a prime integral of the equation (1.18)).

The advantage of looking for quasi-periodic solutions of (1.18) is that, for
$\lambda = 3\varsigma/4$, the fourth order Birkhoff normal form of $K$ is diagonal (Remark 3.3),
and therefore no conditions on the tangential sites $S$ are required (Remark 9.9).

3. The diophantine frequency vector $\omega(\xi) = (\omega_j)_{j \in S^+} \in \mathbb{R}^\nu$ of the quasi-periodic
solutions of Theorem 1.1 is $O(|\xi|)$-close as $\xi \to 0$ (see (1.14)) to the integer
vector of the unperturbed linear frequencies

$$ \bar\omega := (\bar\jmath_1^3, \ldots, \bar\jmath_\nu^3) \in \mathbb{N}^\nu. \tag{1.19} $$

This makes perturbation theory more difficult. This is the difficulty due to
the fact that the mKdV equation is completely resonant at $u = 0$.

4. As shown by (1.13), the expected quasi-periodic solutions are mainly supported
in Fourier space on the tangential sites $S$. The dynamics of the Hamiltonian
PDE (1.1) restricted (and projected) to the symplectic subspaces

$$ H_S := \Big\{ v = \sum_{j \in S} u_j e^{\mathrm{i} j x} \Big\}, \qquad H_S^\perp := \Big\{ z = \sum_{j \in S^c} u_j e^{\mathrm{i} j x} \in H^1_0(\mathbb{T}_x) \Big\}, \tag{1.20} $$

where $S^c := \{ j \in \mathbb{Z} \setminus \{0\} : j \notin S \}$, is quite different. We call $v$ the tangential
variable and $z$ the normal one. On $H_S$ the dynamics is mainly governed by
a finite dimensional integrable system (see Proposition 3.1), and we find it
convenient to describe the dynamics in this subspace by introducing action-angle
variables, see section 4. On the infinite dimensional subspace $H_S^\perp$ the
solution will stay forever close to the elliptic equilibrium $z = 0$.

In Theorem 1.1 it is stated that the quasi-periodic solutions are linearly stable.
This information is not only an important complement of the result, but also an
essential ingredient for the existence proof. Let us explain better what we mean.
By the general procedure in [10] we prove that, around each invariant torus, there
exist symplectic coordinates (see (6.13))

$$ (\psi, \eta, w) \in \mathbb{T}^\nu \times \mathbb{R}^\nu \times H_S^\perp $$

in which the mKdV Hamiltonian (1.4) assumes the normal form

$$ K(\psi, \eta, w) = \omega \cdot \eta + \frac12 K_{20}(\psi) \eta \cdot \eta + \big( K_{11}(\psi) \eta, w \big)_{L^2(\mathbb{T})} + \frac12 \big( K_{02}(\psi) w, w \big)_{L^2(\mathbb{T})} + K_{\ge 3}(\psi, \eta, w) \tag{1.21} $$

where $K_{\ge 3}$ collects the terms at least cubic in the variables $(\eta, w)$, see Remark 6.5.
In these coordinates the quasi-periodic solution reads $t \mapsto (\omega t, 0, 0)$ and the
corresponding linearized equations are

$$ \begin{cases} \dot\psi = K_{20}(\omega t) \eta + K_{11}^T(\omega t) w \\ \dot\eta = 0 \\ \dot w - \partial_x K_{02}(\omega t) w = \partial_x K_{11}(\omega t) \eta. \end{cases} \tag{1.22} $$

Thus the actions $\eta(t) = \eta(0)$ do not evolve in time and the third equation reduces
to the forced PDE

$$ \dot w = \partial_x K_{02}(\omega t)[w] + \partial_x K_{11}(\omega t)[\eta_0]. \tag{1.23} $$

Ignoring the forcing term $\partial_x K_{11}(\omega t)[\eta_0]$ for a moment, we note that the equation
$\dot w = \partial_x K_{02}(\omega t)[w]$ is, up to a finite dimensional remainder (Proposition 7.3), the
restriction to $H_S^\perp$ of the “variational equation”

$$ h_t = \partial_x (\partial_u \nabla H)(u(\omega t, x))[h] = X_K(h), $$

where $X_K$ is the KdV Hamiltonian vector field with quadratic Hamiltonian
$K = \frac12 \big( (\partial_u \nabla H)(u)[h], h \big)_{L^2(\mathbb{T}_x)} = \frac12 (\partial_{uu} H)(u)[h, h]$. This is a linear PDE with
quasi-periodically time-dependent coefficients of the form

$$ h_t = \partial_{xx} \big( a_1(\omega t, x) \partial_x h \big) + \partial_x \big( a_0(\omega t, x) h \big). \tag{1.24} $$

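In section 8 we prove the reducibility of the linear operator $\dot w - \partial_x K_{02}(\omega t) w$, which
conjugates (1.23) to the diagonal system (see (8.64))

$$ \partial_t v = -\mathcal{D}_\infty v + f(\omega t) \tag{1.25} $$

where $\mathcal{D}_\infty := \mathrm{Op}\{\mu_j^\infty\}_{j \in S^c}$ is a Fourier multiplier operator acting in $H_S^\perp$,

$$ \mu_j^\infty := \mathrm{i}(-m_3 j^3 + m_1 j) + r_j^\infty \in \mathrm{i}\mathbb{R}, \qquad j \in S^c, $$

with $m_3 = 1 + O(\varepsilon^3)$, $m_1 = O(\varepsilon^2)$, $\sup_{j \in S^c} |r_j^\infty| = o(\varepsilon^2)$, see (8.61), (8.62). The
eigenvalues $\mu_j^\infty$ are the Floquet exponents of the quasi-periodic solution. The solutions
of the scalar non-homogeneous equations

$$ \dot v_j + \mu_j^\infty v_j = f_j(\omega t), \qquad j \in S^c, \quad \mu_j^\infty \in \mathrm{i}\mathbb{R}, $$

are

$$ v_j(t) = c_j e^{-\mu_j^\infty t} + \tilde v_j(t), \qquad \text{where} \quad \tilde v_j(t) := \sum_{l \in \mathbb{Z}^\nu} \frac{f_{jl}\, e^{\mathrm{i} \omega \cdot l\, t}}{\mathrm{i} \omega \cdot l + \mu_j^\infty} $$

(recall that the first Melnikov conditions (8.66) hold at a solution). As a consequence,
the Sobolev norm of the solution of (1.25) satisfies

$$ \|v(t)\|_{H^s_x} \le C \|v(0)\|_{H^s_x}, \qquad \forall t \in \mathbb{R}, $$

i.e. it does not increase in time.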
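
The solution formula for the scalar non-homogeneous equations can be checked numerically on a toy example. The sketch below is only illustrative: the frequency vector `omega`, the exponent `mu` and the Fourier coefficients `coeffs` are hypothetical values chosen so that no divisor $\mathrm{i}\omega \cdot l + \mu$ vanishes, and the time derivative is approximated by central differences. It verifies that $v(t) = c\, e^{-\mu t} + \tilde v(t)$ solves $\dot v + \mu v = f(\omega t)$, and that the homogeneous part has constant modulus because $\mu$ is purely imaginary.

```python
import cmath


def phase(omega, l, t):
    """(omega . l) * t for a frequency vector omega and an integer vector l."""
    return sum(w * k for w, k in zip(omega, l)) * t


def forcing(coeffs, omega, t):
    """f(omega t) = sum over l of f_l * exp(i * omega.l * t)."""
    return sum(f * cmath.exp(1j * phase(omega, l, t)) for l, f in coeffs.items())


def particular_solution(coeffs, omega, mu, t):
    """Quasi-periodic solution: each harmonic is divided by (i * omega.l + mu)."""
    return sum(f * cmath.exp(1j * phase(omega, l, t)) / (1j * phase(omega, l, 1.0) + mu)
               for l, f in coeffs.items())


def residual(coeffs, omega, mu, c, t, h=1e-5):
    """|v' + mu*v - f| for v = c*exp(-mu*t) + particular solution (v' by central differences)."""
    v = lambda s: c * cmath.exp(-mu * s) + particular_solution(coeffs, omega, mu, s)
    derivative = (v(t + h) - v(t - h)) / (2 * h)
    return abs(derivative + mu * v(t) - forcing(coeffs, omega, t))


omega = (1.0, 2 ** 0.5)      # hypothetical frequency vector with nu = 2
mu = 0.3j                    # hypothetical purely imaginary Floquet exponent
coeffs = {(0, 0): 2.0, (1, 0): 1.0, (0, 1): 0.5j, (1, -1): -0.25}

worst = max(residual(coeffs, omega, mu, c=1.5 - 0.5j, t=t) for t in (0.0, 1.0, 7.3, 40.0))
homogeneous_modulus = abs(cmath.exp(-mu * 123.4))

print(worst, homogeneous_modulus)
```

The harmonic $l = (1, -1)$ has the smallest divisor in this example, so its contribution to $\tilde v$ is amplified; this is the finite-dimensional shadow of the loss of derivatives caused by small divisors.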

We now describe in detail the strategy of proof of Theorem 1.1. Many of the
arguments that we use are quite general and of wide applicability to other PDEs.
Nevertheless, we think that a unique abstract KAM theorem applicable to all
quasi-linear PDEs cannot be expected. Indeed the suitable pseudo-differential operators
that are required to conjugate the highest order of the linearized operator to constant
coefficients highly depend on the PDE at hand, see the discussion after (1.29).

There are two main issues in the proof: 

1. Bifurcation analysis. Find approximate quasi-periodic solutions of (1.1) up
to a sufficiently small remainder (which, in our case, should be $O(u^4)$). In
this step we also find the approximate “frequency-to-amplitude” modulation
of the frequency with respect to the amplitude, see (4.10). This is the goal of
sections 3 and 4.

2. Nash-Moser implicit function theorem. Prove that, close to the above
approximate solutions, there exist exact quasi-periodic solutions of (1.1). By
means of a Nash-Moser iteration, we construct a sequence of approximate
solutions that converges to a quasi-periodic solution of (1.1) (sections 5-9).

The key step consists in proving the invertibility of the linearized operator and
tame estimates for its inverse. This is achieved in two main steps.

(a) Symplectic decoupling procedure. The method in Berti-Bolle [10]
allows one to approximately decouple the “tangential” and the “normal”
dynamics around an approximate invariant torus (section 6). It reduces the
problem to the one of inverting a quasi-periodically forced PDE restricted
to the normal subspace $H_S^\perp$. Its precise form is found in section 7.2.

(b) Analysis of the linearized operator in the normal directions.
In sections 7, 8 we reduce the linearized equations to constant coefficients.
This involves three steps:

i. Reduction in decreasing symbols, sections 8.1-8.3 and 8.5.

ii. Linear Birkhoff normal form, section 8.4.

iii. KAM reducibility, section 8.6.

All the changes of variables used in the steps i)-iii) are $\varphi$-dependent families
of symplectic maps $\Phi(\varphi)$ which act on the phase space $H^1_0(\mathbb{T}_x)$. Therefore
they preserve the Hamiltonian dynamical systems structure of the conjugated
linear operators.

Let us discuss these issues in detail. 

Weak Birkhoff normal form. According to the orthogonal splitting 

$$ H^1_0(\mathbb{T}_x) := H_S \oplus H_S^\perp $$

into the symplectic subspaces defined in (1.20), we decompose

$$ u = v + z, \qquad v = \Pi_S u := \sum_{j \in S} u_j e^{\mathrm{i} j x}, \qquad z = \Pi_S^\perp u := \sum_{j \in S^c} u_j e^{\mathrm{i} j x}, \tag{1.26} $$

where $\Pi_S$, $\Pi_S^\perp$ denote the orthogonal projectors on $H_S$, $H_S^\perp$.

We perform a “weak” Birkhoff normal form (weak BNF), whose goal is to find
an invariant manifold of solutions of the third order approximate mKdV equation
(1.1), on which the dynamics is completely integrable, see section 3. We construct
in Proposition 3.1 a symplectic map $\Phi_B$ such that the transformed Hamiltonian
$\mathcal{H} := H \circ \Phi_B$ possesses the invariant subspace $H_S$ (see (1.20)). To this purpose we
have to eliminate the term $\int v^3 z \, dx$ (which is linear in $z$). Then we check that its
dynamics on $H_S$ is integrable and non-isochronous. For that we perform the classical
finite dimensional Birkhoff normalization of the Hamiltonian term $\int v^4 \, dx$ which
turns out to be integrable and non-isochronous.

Since the present weak Birkhoff map has to remove only finitely many monomials,
it is the time 1-flow map of a Hamiltonian system whose Hamiltonian is supported
on only finitely many Fourier indices. Therefore it is close to the identity up to finite
dimensional operators, see Proposition 3.1. The key advantage is that it modifies $\mathcal{N}_4$
very mildly, only up to finite dimensional operators (see for example Lemma [?]),
and thus the spectral analysis of the linearized equations (that we shall perform in
section 8) is essentially the same as if we were in the original coordinates.

The weak normal form (3.7) does not remove (nor normalize) the monomials
$O(z^2)$. We point out that a stronger normal form that removes/normalizes the
monomials $O(z^2)$ is also well-defined (it is called “partial Birkhoff normal form” in
Kuksin-Pöschel [25] and Pöschel [26]). However, we do not use it because, for such
a stronger normal form, the corresponding Birkhoff map is close to the identity only
up to an operator of order $O(\partial_x^{-1})$, and so it would produce terms of order $\partial_{xx}$ and
$\partial_x$. For the same reason, we do not use the global nonlinear Fourier transform in
[20] (Birkhoff coordinates), which is close to the Fourier transform up to smoothing
operators of order $O(\partial_x^{-1})$ (this is explicitly proved for KdV).

We remark that mKdV is simpler than KdV because the nonlinearity in (1.1) is
cubic and not only quadratic, and, as a consequence, fewer steps of Birkhoff normal
form are required to reach the sufficient smallness for the Nash-Moser scheme to
converge (see Remark 9.2).

Action-angle and rescaling. At this point we introduce action-angle variables on the
tangential sites (section 4) and, after the rescaling (4.5), we look for quasi-periodic
solutions of the Hamiltonian (4.9). Note that the coefficients of the normal form $\mathcal{N}$
in (4.13) depend on the angles $\theta$, unlike the usual KAM theorems [26], [22], where
the whole normal form is reduced to constant coefficients. This is because the weak
BNF of section 3 did not normalize the quadratic terms $O(z^2)$. These terms are
dealt with by the “linear Birkhoff normal form” (linear BNF) in section 8.4. In some
sense the “partial” Birkhoff normal form of [26] is split into the weak BNF of section
3 and the linear BNF of section 8.4.

The present functional formulation with the introduction of the action-angle
variables allows one to prove the stability of the solutions (unlike the Lyapunov-Schmidt
reduction approach).

Nonlinear functional setting and approximate inverse. We look for a zero of the
nonlinear operator (5.6), where the unknown is the torus embedding $\varphi \mapsto i(\varphi)$
and where the frequency $\omega$ is seen as an “external” parameter. This formulation is
convenient in order to verify the Melnikov non-resonance conditions required to
invert the linearized operators at each step. The solution is obtained by a Nash-Moser
iterative scheme in Sobolev scales. The key step is to construct (for $\omega$ restricted to a
suitable Cantor-like set) an approximate inverse (à la Zehnder [30]) of the linearized
operator at any approximate solution. Roughly, this means finding a linear operator
which is an inverse at an exact solution. A major difficulty is that the tangential
and the normal dynamics near an invariant torus are strongly coupled.

Symplectic approximate decoupling. The above difficulty is overcome by implementing
the abstract procedure in Berti-Bolle [10], which was developed in order to prove
the existence of quasi-periodic solutions for autonomous NLW (and NLS) with a
multiplicative potential. This approach reduces the search of an approximate inverse
for (5.6) to the invertibility of a quasi-periodically forced PDE restricted to the
normal directions. This method approximately decouples the tangential and the
normal dynamics around an approximate invariant torus, introducing a suitable set
of symplectic variables

$$ (\psi, \eta, w) \in \mathbb{T}^\nu \times \mathbb{R}^\nu \times H_S^\perp $$

near the torus, see (6.13). Note that, in the first line of (6.13), $\psi$ is the “natural”
angle variable which coordinates the torus, and, in the third line, the normal variable
$z$ is only translated by the component $z_0(\psi)$ of the torus. The second line completes
this transformation to a symplectic one. The canonicity of this map is proved in
[10] using the isotropy of the approximate invariant torus $i_\delta$, see Lemma 6.3. In
these new variables the torus $\psi \mapsto i_\delta(\psi)$ reads $\psi \mapsto (\psi, 0, 0)$. The main advantage
of these coordinates is that the second equation in (6.22) (which corresponds to the
action variables of the torus) can be immediately solved, see (6.24). Then it remains
to solve the third equation (6.25), i.e. to invert the linear operator $\mathcal{L}_\omega$. This is a
quasi-periodic Hamiltonian perturbed linear Airy equation of the form

$$ h \mapsto \mathcal{L}_\omega h := \Pi_S^\perp \big( \omega \cdot \partial_\varphi h + \partial_{xx}(a_1 \partial_x h) + \partial_x(a_0 h) + \partial_x \mathcal{R} h \big), \qquad \forall h \in H_S^\perp, \tag{1.27} $$

where $\mathcal{R}$ is a finite dimensional remainder. The exact form of $\mathcal{L}_\omega$ is obtained in
Proposition 7.4, see (7.23).

Reduction to constant coefficients of the linearized operator in the normal directions.
In section 8 we conjugate the variable coefficients operator $\mathcal{L}_\omega$ to a diagonal operator
with constant coefficients which describes infinitely many harmonic oscillators

$$ \dot v_j + \mu_j^\infty v_j = 0, \qquad \mu_j^\infty := \mathrm{i}(-m_3 j^3 + m_1 j) + r_j^\infty \in \mathrm{i}\mathbb{R}, \quad j \notin S, \tag{1.28} $$

where the constants $m_3 - 1$, $m_1 \in \mathbb{R}$ and $\sup_j |r_j^\infty|$ are small, see Theorem 8.15.
The main perturbative effect to the spectrum (and the eigenfunctions) of $\mathcal{L}_\omega$ is due
to the term $a_1(\omega t, x) \partial_{xxx}$ (see (1.27)), and it is too strong for the usual reducibility
KAM techniques to work directly. The conjugacy of $\mathcal{L}_\omega$ with (1.28) is obtained in
several steps. The first task (obtained in sections 8.1-8.5) is to conjugate $\mathcal{L}_\omega$ to
another Hamiltonian operator of $H_S^\perp$ with constant coefficients

$$ \mathcal{L}_5 := \Pi_S^\perp \big( \omega \cdot \partial_\varphi + m_3 \partial_{xxx} + m_1 \partial_x + R_5 \big) \Pi_S^\perp, \qquad m_1, m_3 \in \mathbb{R}, \tag{1.29} $$

up to a small bounded remainder $R_5 = O(\partial_x^0)$, see (8.56). This expansion of $\mathcal{L}_\omega$
in “decreasing symbols” with constant coefficients follows [3], and it is somehow
in the spirit of the works of Iooss, Plotnikov and Toland [?]-[15] in water waves
theory, and Baldi [2] for Benjamin-Ono. It is obtained by transformations which
are very different from the usual KAM changes of variables. We underline that the
specific form of these transformations depends on the structure of mKdV. For other
quasi-linear PDEs the analogous reduction requires different transformations, see
for example Alazard-Baldi [1], Berti-Montalto [11] for recent developments of these
techniques for gravity-capillary water waves, and Feola-Procesi [13] for quasi-linear
forced perturbations of Schrödinger equations.

The transformation of (1.27) into (1.29) is made in several steps.

1. Reduction of the highest order. The first step (section 8.1) is to eliminate the
$x$-dependence from the coefficient $a_1(\omega t, x) \partial_{xxx}$ of the Hamiltonian operator
$\mathcal{L}_\omega$. In order to find a symplectic diffeomorphism of $H_S^\perp$ near $\mathcal{A}_\perp$, the starting
point is to observe that the diffeomorphism (see (8.1))

$$ u \mapsto (\mathcal{A} u)(\varphi, x) := (1 + \beta_x(\varphi, x))\, u(\varphi, x + \beta(\varphi, x)), $$

is, for each $\varphi \in \mathbb{T}^\nu$, the time-one flow map of the time dependent Hamiltonian
transport linear PDE

$$ \partial_\tau u = \partial_x \big( b(\varphi, \tau, x) u \big), \qquad b(\varphi, \tau, x) := \frac{\beta(\varphi, x)}{1 + \tau \beta_x(\varphi, x)}. \tag{1.30} $$

Actually the flow of (1.30) is the path of symplectic diffeomorphisms

$$ u(\varphi, x) \mapsto (1 + \tau \beta_x(\varphi, x))\, u(\varphi, x + \tau \beta(\varphi, x)), \qquad \tau \in [0, 1]. $$

Thus, like in [5], we conjugate $\mathcal{L}_\omega$ with the symplectic time 1 flow map of the
projected Hamiltonian equation

$$ \partial_\tau u = \Pi_S^\perp \partial_x (b(\tau, x) u) = \partial_x (b(\tau, x) u) - \Pi_S \partial_x (b(\tau, x) u), \qquad u \in H_S^\perp \tag{1.31} $$

generated by the quadratic Hamiltonian $\frac12 \int_{\mathbb{T}} b(\tau, x) u^2 \, dx$ restricted to $H_S^\perp$.
By Lemma 8.1 (which was proved in [5]) such symplectic map differs from
$\mathcal{A}_\perp := \Pi_S^\perp \mathcal{A} \Pi_S^\perp$ only for finite dimensional operators.

This step may be seen as a quantitative application of the Egorov theorem, see
[28], which describes how the principal symbol of a pseudo-differential operator
(here $a_1(\omega t, x) \partial_{xxx}$) transforms under the flow of a linear hyperbolic PDE (here
the projected transport equation (1.31)).

Because of the Hamiltonian structure, the previous step also eliminates the
term $O(\partial_{xx})$, see (8.13). In section 8.2 we eliminate the time-dependence of
the coefficient at the order $\partial_{xxx}$.

2. Linear Birkhoff normal form. In section 8.4 we eliminate the variable
coefficient terms at the order $O(\varepsilon^2)$, which are present in the operator $\mathcal{L}_\omega$, see
(7.23)-(7.24). This is a consequence of the fact that the weak BNF procedure
of section 3 did not touch the quadratic terms $O(z^2)$. These terms cannot be
reduced to constants by the perturbative scheme in section 8.6 (developed in
[3]) which applies to terms $R$ such that $R \gamma^{-1} \ll 1$ where $\gamma$ is the diophantine
constant of the frequency vector $\omega$ (the case in [3] is simpler because the
diophantine constant is $\gamma = O(1)$). Here, as well as in [5], since mKdV is
completely resonant, such $\gamma = o(\varepsilon^2)$, see (5.3). The terms of size $\varepsilon^2$ are reduced
to constant coefficients in section 8.4 by means of purely algebraic arguments
(linear BNF), which, ultimately, stem from the complete integrability of the
fourth order BNF of the mKdV equation (1.9). More general nonlinearities
should be dealt with by the normal form arguments of Procesi-Procesi [27] for
generic choices of the tangential sites.
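
The transport structure used in step 1 can be illustrated numerically in one space variable, dropping the dependence on $\varphi$. The sketch below is a toy check under our own assumptions: `beta` and `u0` are hypothetical smooth $2\pi$-periodic functions with $|\beta_x| < 1$, and derivatives are approximated by central differences. It verifies that $u(\tau, x) := (1 + \tau \beta_x(x))\, u_0(x + \tau \beta(x))$ satisfies $\partial_\tau u = \partial_x (b u)$ with $b = \beta / (1 + \tau \beta_x)$, and that the map preserves the mass $\int_{\mathbb{T}} u \, dx$, hence the zero-average phase space.

```python
import math


def beta(x):
    return 0.3 * math.sin(x)      # hypothetical small periodic shift


def beta_x(x):
    return 0.3 * math.cos(x)      # its derivative; |beta_x| < 1 keeps x + tau*beta(x) invertible


def u0(y):
    return 1.0 + math.cos(2 * y) + 0.5 * math.sin(y)   # hypothetical periodic profile


def transported(tau, x):
    """(1 + tau*beta_x(x)) * u0(x + tau*beta(x)): the path of diffeomorphisms at time tau."""
    return (1 + tau * beta_x(x)) * u0(x + tau * beta(x))


def periodic_integral(g, n=256):
    """Rectangle rule on [0, 2*pi), very accurate for smooth periodic integrands."""
    h = 2 * math.pi / n
    return h * sum(g(k * h) for k in range(n))


def flow_residual(tau, x, h=1e-5):
    """|d_tau u - d_x(b*u)| by central differences, using b*u = beta(x) * u0(x + tau*beta(x))."""
    flux = lambda y: beta(y) * u0(y + tau * beta(y))
    d_tau = (transported(tau + h, x) - transported(tau - h, x)) / (2 * h)
    d_x = (flux(x + h) - flux(x - h)) / (2 * h)
    return abs(d_tau - d_x)


mass_defect = max(
    abs(periodic_integral(lambda x: transported(tau, x)) - periodic_integral(u0))
    for tau in (0.0, 0.5, 1.0)
)
pde_defect = max(flow_residual(tau, x) for tau in (0.25, 0.75) for x in (0.1, 1.7, 4.0))

print(mass_defect, pde_defect)
```

The factor $1 + \tau \beta_x$ is exactly the Jacobian of the change of variable $y = x + \tau \beta(x)$, which is why the integral is preserved; the projection onto $H_S^\perp$ in (1.31) is not modeled here.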

Complete diagonalization of (1.29). In section 8.6 we apply the abstract KAM
reducibility Theorem 4.2 of [3], which completely diagonalizes the linearized operator,
obtaining (1.28). The required smallness condition (8.58) for $R_0$ holds once
the linear BNF of section 8.4 has put into constant coefficients the unbounded terms
of nonperturbative size $\varepsilon^2$, and the conjugation procedure of sections 8.1-8.3 and 8.5
has arrived at a bounded and small remainder $R_0$.

The Nash-Moser iteration to an invariant torus embedding. In section 9 we perform
the nonlinear Nash-Moser iteration which finally proves Theorem 5.1 and, therefore,
Theorem 1.1. The smallness condition that is required for the convergence of the
scheme is $\varepsilon^2 \|\mathcal{F}(\varphi,0,0)\|_{s_0+\mu}\gamma^{-2}$ sufficiently small, see (9.5). It is verified because
$\|X_P(\varphi,0,0)\|_s \le_s \varepsilon^{5-2b}$ (Lemma 5.3) and $\gamma = \varepsilon^{2+a}$ with $a > 0$ small. See also
remark H7T?1 for a comparison between the smallness condition required here with the 
one in [5]- 


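Notation. We shall use the notation

\[
a \le_s b \quad \Longleftrightarrow \quad a \le C(s)\, b \quad \text{for some constant } C(s) > 0 .
\]

We denote by $\pi_0$ the operator

\[
u \mapsto \pi_0(u) := u - \frac{1}{2\pi} \int_{\mathbb{T}} u\,dx . \tag{1.32}
\]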


2 Functional setting 


For a function $u : \Omega_o \to E$, $\omega \mapsto u(\omega)$, where $(E, \|\ \|_E)$ is a Banach space and $\Omega_o$ is
a subset of $\mathbb{R}^\nu$, we define the sup-norm and the Lipschitz semi-norm

\[
\|u\|^{\sup}_E := \|u\|^{\sup}_{E,\Omega_o} := \sup_{\omega \in \Omega_o} \|u(\omega)\|_E , \qquad
\|u\|^{\mathrm{lip}}_E := \|u\|^{\mathrm{lip}}_{E,\Omega_o} := \sup_{\omega_1 \neq \omega_2} \frac{\|u(\omega_1) - u(\omega_2)\|_E}{|\omega_1 - \omega_2|} , \tag{2.1}
\]

and, for $\gamma > 0$, the Lipschitz norm

\[
\|u\|^{\mathrm{Lip}(\gamma)}_E := \|u\|^{\mathrm{Lip}(\gamma)}_{E,\Omega_o} := \|u\|^{\sup}_E + \gamma \|u\|^{\mathrm{lip}}_E . \tag{2.2}
\]

If $E = H^s$ we simply denote $\|u\|^{\mathrm{Lip}(\gamma)}_{H^s} := \|u\|^{\mathrm{Lip}(\gamma)}_s$.

Sobolev norms. We denote by

\[
\|u\|_s := \|u\|_{H^s(\mathbb{T}^{\nu+1})} := \|u\|_{H^s_{\varphi,x}} \tag{2.3}
\]

the Sobolev norm of functions $u = u(\varphi,x)$ in the Sobolev space $H^s(\mathbb{T}^{\nu+1})$. We
denote by $\|\ \|_{H^s_x}$ the Sobolev norm in the phase space of functions $u := u(x) \in
H^s(\mathbb{T})$. Moreover $\|\ \|_{H^s_\varphi}$ denotes the Sobolev norm of scalar functions, like the
Fourier components $u_j(\varphi)$.

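We fix $s_0 := (\nu+2)/2$ so that $H^{s_0}(\mathbb{T}^{\nu+1}) \hookrightarrow L^\infty(\mathbb{T}^{\nu+1})$ and any space $H^s(\mathbb{T}^{\nu+1})$,
$s \ge s_0$, is an algebra and satisfies the interpolation inequalities: for $s \ge s_0$,

\[
\|uv\|_s \le C(s_0)\|u\|_s\|v\|_{s_0} + C(s)\|u\|_{s_0}\|v\|_s , \qquad \forall u, v \in H^s(\mathbb{T}^{\nu+1}) .
\]

The above inequalities also hold for the norms $\|\ \|_s^{\mathrm{Lip}(\gamma)}$.

We also denote

\[
H^s_{S^\perp}(\mathbb{T}^{\nu+1}) := \{ u \in H^s(\mathbb{T}^{\nu+1}) : u(\varphi,\cdot) \in H_S^\perp \ \ \forall \varphi \in \mathbb{T}^\nu \} ,
\]

\[
H^s_{S}(\mathbb{T}^{\nu+1}) := \{ u \in H^s(\mathbb{T}^{\nu+1}) : u(\varphi,\cdot) \in H_S \ \ \forall \varphi \in \mathbb{T}^\nu \} .
\]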


Matrices with off-diagonal decay. A linear operator can be identified, as usual, 
with its matrix representation. We recall the definition of the s-decay norm (intro¬ 
duced in 0) of an infinite dimensional matrix. 

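Definition 2.1. Let $A := (A_{i_1}^{i_2})_{i_1, i_2 \in \mathbb{Z}^b}$, $b \ge 1$, be an infinite dimensional matrix.
Its s-decay norm $|A|_s$ is defined by

\[
|A|_s^2 := \sum_{i \in \mathbb{Z}^b} \langle i \rangle^{2s} \Big( \sup_{i_1 - i_2 = i} |A_{i_1}^{i_2}| \Big)^2 . \tag{2.4}
\]

For parameter dependent matrices $A := A(\omega)$, $\omega \in \Omega_o \subset \mathbb{R}^\nu$, the definitions (2.1)
and (2.2) become

\[
|A|_s^{\sup} := \sup_{\omega \in \Omega_o} |A(\omega)|_s , \qquad
|A|_s^{\mathrm{lip}} := \sup_{\omega_1 \neq \omega_2} \frac{|A(\omega_1) - A(\omega_2)|_s}{|\omega_1 - \omega_2|} , \tag{2.5}
\]

and $|A|_s^{\mathrm{Lip}(\gamma)} := |A|_s^{\sup} + \gamma |A|_s^{\mathrm{lip}}$.

Such a norm is modeled on the behavior of matrices representing the multiplication
operator by a function. Actually, given a function $p \in H^s(\mathbb{T}^b)$, the
multiplication operator $h \mapsto ph$ is represented by the Töplitz matrix $T_i^{i'} = p_{i-i'}$ and
$|T|_s = \|p\|_s$. If $p = p(\omega)$ is a Lipschitz family of functions, then

\[
|T|_s^{\mathrm{Lip}(\gamma)} = \|p\|_s^{\mathrm{Lip}(\gamma)} .
\]

The s-norm satisfies classical algebra and interpolation inequalities proved in [3].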
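
The identity $|T|_s = \|p\|_s$ for multiplication operators can be checked numerically on a truncation. The following is only an illustrative sketch: it works in one dimension ($b = 1$), it assumes the weight $\langle i \rangle := \max(1,|i|)$ and the Sobolev norm $\|p\|_s^2 = \sum_k \langle k \rangle^{2s} |p_k|^2$ (conventions that are not restated in this section), and all function names are hypothetical.

```python
import math

def bracket(i):
    # Assumed weight <i> := max(1, |i|).
    return max(1, abs(i))

def sobolev_norm(p, s):
    # p maps a Fourier index k to the coefficient p_k.
    return math.sqrt(sum(bracket(k) ** (2 * s) * abs(c) ** 2 for k, c in p.items()))

def toeplitz_matrix(p, n):
    # Truncated matrix T[i1, i2] = p_{i1 - i2}, with i1, i2 in {-n, ..., n}.
    idx = range(-n, n + 1)
    return {(i1, i2): p.get(i1 - i2, 0.0) for i1 in idx for i2 in idx}

def decay_norm(a, s):
    # s-decay norm (2.4): weighted l^2 norm of the sup over each diagonal i1 - i2 = i.
    sup_on_diagonal = {}
    for (i1, i2), entry in a.items():
        i = i1 - i2
        sup_on_diagonal[i] = max(sup_on_diagonal.get(i, 0.0), abs(entry))
    return math.sqrt(sum(bracket(i) ** (2 * s) * m ** 2 for i, m in sup_on_diagonal.items()))

# A trigonometric polynomial with p_{-k} = conj(p_k) (real-valued function).
p = {-2: 0.5 - 1j, -1: 2.0, 0: 1.0, 1: 2.0, 2: 0.5 + 1j}
T = toeplitz_matrix(p, n=4)
print(decay_norm(T, s=1.5), sobolev_norm(p, s=1.5))
```

The two printed numbers agree up to rounding as long as the truncation contains every diagonal on which $p$ is supported, since each diagonal of $T$ is constant and equal to one Fourier coefficient of $p$.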

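Lemma 2.1. Let $A = A(\omega)$, $B = B(\omega)$ be matrices depending in a Lipschitz way on
the parameter $\omega \in \Omega_o \subset \mathbb{R}^\nu$. Then for all $s \ge s_0 > b/2$ there are $C(s) \ge C(s_0) \ge 1$
such that

\[
|AB|_s^{\mathrm{Lip}(\gamma)} \le C(s) |A|_s^{\mathrm{Lip}(\gamma)} |B|_s^{\mathrm{Lip}(\gamma)} ,
\]

\[
|AB|_s^{\mathrm{Lip}(\gamma)} \le C(s) |A|_s^{\mathrm{Lip}(\gamma)} |B|_{s_0}^{\mathrm{Lip}(\gamma)} + C(s_0) |A|_{s_0}^{\mathrm{Lip}(\gamma)} |B|_s^{\mathrm{Lip}(\gamma)} .
\]

The s-decay norm controls the Sobolev norm, namely

\[
\|Ah\|_s^{\mathrm{Lip}(\gamma)} \le C(s) \big( |A|_{s_0}^{\mathrm{Lip}(\gamma)} \|h\|_s^{\mathrm{Lip}(\gamma)} + |A|_s^{\mathrm{Lip}(\gamma)} \|h\|_{s_0}^{\mathrm{Lip}(\gamma)} \big) .
\]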


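Let now $b := \nu + 1$. An important sub-algebra is formed by the Töplitz in time
matrices defined by

\[
A^{(l_2, j_2)}_{(l_1, j_1)} := A^{j_2}_{j_1}(l_1 - l_2) ,
\]

whose decay norm (2.4) is

\[
|A|_s^2 = \sum_{j \in \mathbb{Z},\, l \in \mathbb{Z}^\nu} \Big( \sup_{j_1 - j_2 = j} |A_{j_1}^{j_2}(l)| \Big)^2 \langle l, j \rangle^{2s} .
\]

These matrices are identified with the $\varphi$-dependent family of operators

\[
A(\varphi) := \big( A_{j_1}^{j_2}(\varphi) \big)_{j_1, j_2 \in \mathbb{Z}} , \qquad A_{j_1}^{j_2}(\varphi) := \sum_{l \in \mathbb{Z}^\nu} A_{j_1}^{j_2}(l)\, e^{\mathrm{i} l \cdot \varphi} ,
\]

which act on functions of the x-variable as

\[
A(\varphi) : h(x) = \sum_{j \in \mathbb{Z}} h_j e^{\mathrm{i} j x} \mapsto A(\varphi) h(x) = \sum_{j_1, j_2 \in \mathbb{Z}} A_{j_1}^{j_2}(\varphi)\, h_{j_2} e^{\mathrm{i} j_1 x} .
\]

All the transformations that we construct in this paper are of this type (with
$j, j_1, j_2 \neq 0$ because they act on the phase space $H_0^1(\mathbb{T}_x)$).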

Definition 2.2. We say that

1. an operator $(Ah)(\varphi,x) := A(\varphi)h(\varphi,x)$ is symplectic if each $A(\varphi)$, $\varphi \in \mathbb{T}^\nu$, is
a symplectic map of the phase space (or of a symplectic subspace like $H_S^\perp$);

2. the operator $\omega \cdot \partial_\varphi - \partial_x G(\varphi)$ is Hamiltonian if each $G(\varphi)$, $\varphi \in \mathbb{T}^\nu$, is symmetric;

3. an operator is real if it maps real-valued functions into real-valued functions.

A Hamiltonian operator is transformed, under a symplectic map, into another
Hamiltonian operator, see [3]-section 2.3.

We conclude this preliminary section recalling the following well known lemmata 
about composition of functions (see, e.g., Appendix of 0)- 

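Lemma 2.2 (Composition). Assume $f \in C^s(\mathbb{T}^d \times B_1)$, $B_1 := \{ y \in \mathbb{R}^m : |y| < 1 \}$.
Then $\forall u \in H^s(\mathbb{T}^d, \mathbb{R}^m)$ such that $\|u\|_{L^\infty} < 1$, the composition operator $\tilde f(u)(x) :=
f(x, u(x))$ satisfies $\|\tilde f(u)\|_s \le C \|f\|_{C^s} (\|u\|_s + 1)$ where the constant $C$ depends on
$s, d$. If $f \in C^{s+2}$ and $\|u + h\|_{L^\infty} < 1$, then for $k = 0, 1$

\[
\Big\| \tilde f(u+h) - \sum_{i=0}^{k} \frac{\tilde f^{(i)}(u)}{i!} [h^i] \Big\|_s \le C \|f\|_{C^{s+2}} \|h\|_{L^\infty}^{k} \big( \|h\|_s + \|h\|_{L^\infty} \|u\|_s \big) .
\]

The statement also holds replacing $\|\ \|_s$ with the norms $|\ |_{s,\infty}$ of $W^{s,\infty}(\mathbb{T}^d)$.

Lemma 2.3 (Change of variable). Let $p \in W^{s,\infty}(\mathbb{T}^d, \mathbb{R}^d)$, $s \ge 1$, with $\|p\|_{W^{1,\infty}}
\le 1/2$. Then the function $f(x) = x + p(x)$ is invertible, with inverse $f^{-1}(y) = y + q(y)$
where $q \in W^{s,\infty}(\mathbb{T}^d, \mathbb{R}^d)$, and $\|q\|_{W^{s,\infty}} \le C \|p\|_{W^{s,\infty}}$.

If, moreover, $p$ depends in a Lipschitz way on a parameter $\omega \in \Omega \subset \mathbb{R}^\nu$, and
$\|D_x p\|_{L^\infty} \le 1/2$ for all $\omega$, then $\|q\|^{\mathrm{Lip}(\gamma)}_{W^{s,\infty}} \le C \|p\|^{\mathrm{Lip}(\gamma)}_{W^{s+1,\infty}}$. The constant $C := C(d,s)$
is independent of $\gamma$.

If $u \in H^s(\mathbb{T}^d, \mathbb{C})$, then $(u \circ f)(x) := u(x + p(x))$ satisfies

\[
\|u \circ f\|_s \le C \big( \|u\|_s + \|p\|_{W^{s,\infty}} \|u\|_1 \big) ,
\]

\[
\|u \circ f - u\|_s \le C \big( \|p\|_{L^\infty} \|u\|_{s+1} + \|p\|_{W^{s,\infty}} \|u\|_2 \big) ,
\]

\[
\|u \circ f\|_s^{\mathrm{Lip}(\gamma)} \le C \big( \|u\|_{s+1}^{\mathrm{Lip}(\gamma)} + \|p\|_{W^{s,\infty}}^{\mathrm{Lip}(\gamma)} \|u\|_2^{\mathrm{Lip}(\gamma)} \big) .
\]

The function $u \circ f^{-1}$ satisfies the same bounds.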


3 Weak Birkhoff normal form 


In this section it is convenient to analyze the mKdV equation in the Fourier
representation

\[
u(x) = \sum_{j \in \mathbb{Z} \setminus \{0\}} u_j e^{\mathrm{i} j x} , \qquad u(x) \longleftrightarrow u := (u_j)_{j \in \mathbb{Z} \setminus \{0\}} , \quad u_{-j} = \bar u_j , \tag{3.1}
\]

where the Fourier indices are nonzero integers $j$, by the definition (1.5) of the phase
space, and $u_{-j} = \bar u_j$ because $u(x)$ is real-valued. The symplectic structure (1.6)
writes

\[
\Omega = \frac12 \sum_{j \neq 0} \frac{1}{\mathrm{i} j}\, du_j \wedge du_{-j} , \qquad \Omega(u, v) = \sum_{j \neq 0} \frac{1}{\mathrm{i} j}\, u_j v_{-j} , \tag{3.2}
\]

the Hamiltonian vector field $X_H$ in (1.3) and the Poisson bracket $\{F, G\}$ in (1.7) are
respectively

\[
[X_H(u)]_j = \mathrm{i} j\, \partial_{u_{-j}} H(u) , \qquad \{F, G\}(u) = - \sum_{j \neq 0} \mathrm{i} j\, (\partial_{u_{-j}} F)(u) (\partial_{u_j} G)(u) . \tag{3.3}
\]

We shall sometimes identify $v \equiv (v_j)_{j \in S}$ and $z \equiv (z_j)_{j \in S^c}$.

The Hamiltonian of the perturbed cubic mKdV equation (1.1) is $H = H_2 + H_4 +
H_{\ge 5}$ (see (1.4)) where

\[
H_2(u) := \int_{\mathbb{T}} \frac{u_x^2}{2}\,dx , \qquad H_4(u) := - \varsigma \int_{\mathbb{T}} \frac{u^4}{4}\,dx , \qquad H_{\ge 5}(u) := \int_{\mathbb{T}} f(x, u, u_x)\,dx , \tag{3.4}
\]

$\varsigma = \pm 1$ and $f$ satisfies (1.8). According to the splitting (1.26) $u = v + z$, where
$v \in H_S$ and $z \in H_S^\perp$, we have $H_2(u) = H_2(v) + H_2(z)$ and

\[
H_4(u) = - \frac{\varsigma}{4} \int_{\mathbb{T}} v^4\,dx - \varsigma \int_{\mathbb{T}} v^3 z\,dx - \frac{3\varsigma}{2} \int_{\mathbb{T}} v^2 z^2\,dx - \varsigma \int_{\mathbb{T}} v z^3\,dx - \frac{\varsigma}{4} \int_{\mathbb{T}} z^4\,dx .
\]

For a finite-dimensional space

\[
E := E_C := \mathrm{span}\{ e^{\mathrm{i} j x} : 0 < |j| \le C \} , \qquad C > 0 , \tag{3.5}
\]

let $\Pi_E$ denote the corresponding $L^2$-projector on $E$.

In the next proposition we construct a symplectic map $\Phi_B$ such that the
transformed Hamiltonian $\mathcal{H} := H \circ \Phi_B$ possesses the invariant subspace $H_S$ defined in
(1.20), and its dynamics on $H_S$ is integrable and non-isochronous. To this purpose we
have to eliminate the term $\int v^3 z\,dx$ (which is linear in $z$) and to normalize the term
$\int v^4\,dx$ (which is independent of $z$) in the quartic component of the Hamiltonian.

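Proposition 3.1 (Weak Birkhoff normal form). There exists an analytic invertible
symplectic transformation of the phase space $\Phi_B : H_0^1(\mathbb{T}_x) \to H_0^1(\mathbb{T}_x)$ of the form

\[
\Phi_B(u) = u + \Psi(u) , \qquad \Psi(u) = \Pi_E \Psi(\Pi_E u) , \tag{3.6}
\]

where $E$ is a finite-dimensional space as in (3.5), such that the transformed
Hamiltonian is

\[
\mathcal{H} := H \circ \Phi_B = H_2 + \mathcal{H}_4 + \mathcal{H}_{\ge 5} , \tag{3.7}
\]

where $H_2$ is defined in (3.4),

\[
\mathcal{H}_4 := \frac{3\varsigma}{4} \Big( \sum_{j \in S} |u_j|^4 - \sum_{j, j' \in S} |u_j|^2 |u_{j'}|^2 \Big) - \frac{3\varsigma}{2} \int_{\mathbb{T}} v^2 z^2\,dx - \varsigma \int_{\mathbb{T}} v z^3\,dx - \frac{\varsigma}{4} \int_{\mathbb{T}} z^4\,dx , \tag{3.8}
\]

and $\mathcal{H}_{\ge 5}$ collects all the terms of order at least five in $(v, z)$.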

Proof. In Fourier coordinates (3.1) we have (see (3.4))

\[
H_2(u) = \frac12 \sum_{j \neq 0} j^2 |u_j|^2 , \qquad H_4(u) = - \frac{\varsigma}{4} \sum_{j_1 + j_2 + j_3 + j_4 = 0} u_{j_1} u_{j_2} u_{j_3} u_{j_4} . \tag{3.9}
\]

We look for a symplectic transformation $\Phi$ of the phase space which eliminates or
normalizes the monomials $u_{j_1} u_{j_2} u_{j_3} u_{j_4}$ of $H_4$ with at most one index outside $S$. By
the relation $j_1 + j_2 + j_3 + j_4 = 0$, they are finitely many. Thus, we look for a map
$\Phi := (\Phi_F^t)_{|t=1}$ which is the time 1-flow map of an auxiliary quartic Hamiltonian

\[
F(u) := \sum_{j_1 + j_2 + j_3 + j_4 = 0} F_{j_1 j_2 j_3 j_4}\, u_{j_1} u_{j_2} u_{j_3} u_{j_4} .
\]

The transformed Hamiltonian is

\[
\mathcal{H} := H \circ \Phi = H_2 + \mathcal{H}_4 + \mathcal{H}_{\ge 5} , \qquad \mathcal{H}_4 = \{H_2, F\} + H_4 , \tag{3.10}
\]

where $\mathcal{H}_{\ge 5}$ collects all the terms in $\mathcal{H}$ of order at least five. By (3.9) and (3.3) we
calculate

\[
\mathcal{H}_4 = \sum_{j_1 + j_2 + j_3 + j_4 = 0} \Big\{ - \frac{\varsigma}{4} - \mathrm{i} (j_1^3 + j_2^3 + j_3^3 + j_4^3) F_{j_1 j_2 j_3 j_4} \Big\}\, u_{j_1} u_{j_2} u_{j_3} u_{j_4} .
\]

In order to eliminate or normalize only the monomials with at most one index outside
$S$, we choose

\[
F_{j_1 j_2 j_3 j_4} := \begin{cases} \dfrac{\mathrm{i} \varsigma}{4 (j_1^3 + j_2^3 + j_3^3 + j_4^3)} & \text{if } (j_1, j_2, j_3, j_4) \in \mathcal{A} , \\[2mm] 0 & \text{otherwise,} \end{cases} \tag{3.11}
\]

where

\[
\mathcal{A} := \big\{ (j_1, j_2, j_3, j_4) \in (\mathbb{Z} \setminus \{0\})^4 : j_1 + j_2 + j_3 + j_4 = 0 , \ j_1^3 + j_2^3 + j_3^3 + j_4^3 \neq 0 ,
\text{ and at least three among } j_1, j_2, j_3, j_4 \text{ belong to } S \big\} .
\]

We recall the following elementary identity (Lemma 13.4 in [19]).

Lemma 3.2. Let $j_1, j_2, j_3, j_4 \in \mathbb{Z}$ such that $j_1 + j_2 + j_3 + j_4 = 0$. Then

\[
j_1^3 + j_2^3 + j_3^3 + j_4^3 = -3 (j_1 + j_2)(j_1 + j_3)(j_2 + j_3) .
\]

By definition (3.11), $\mathcal{H}_4$ does not contain any monomial $u_{j_1} u_{j_2} u_{j_3} u_{j_4}$ with three
indices in $S$ and one outside, because there exist no integers $j_1, j_2, j_3 \in S$, $j_4 \in S^c$
satisfying $j_1 + j_2 + j_3 + j_4 = 0$ and $j_1^3 + j_2^3 + j_3^3 + j_4^3 = 0$, by Lemma 3.2 and the fact
that $S$ is symmetric.

By construction, the quartic monomials with at least two indices outside $S$ are
not changed by $\Phi$. Also, by construction, the monomials $u_{j_1} u_{j_2} u_{j_3} u_{j_4}$ in $\mathcal{H}_4$ with all
integers in $S$ are those for which $j_1 + j_2 + j_3 + j_4 = 0$ and $j_1^3 + j_2^3 + j_3^3 + j_4^3 = 0$. By
Lemma 3.2 we split

\[
\sum_{\substack{j_1, j_2, j_3, j_4 \in S \\ j_1 + j_2 + j_3 + j_4 = 0 \\ j_1^3 + j_2^3 + j_3^3 + j_4^3 = 0}} u_{j_1} u_{j_2} u_{j_3} u_{j_4} = A_1 + A_2 + A_3
\]

where $A_1$ is given by the sum over $j_1, j_2, j_3, j_4 \in S$, $j_1 + j_2 + j_3 + j_4 = 0$ with the
restriction $j_1 + j_2 = 0$, $A_2$ with the restriction $j_1 + j_2 \neq 0$ and $j_1 + j_3 = 0$, and $A_3$
with the restriction $j_1 + j_2 \neq 0$, $j_1 + j_3 \neq 0$ and $j_2 + j_3 = 0$. We get

\[
A_1 = \sum_{j, j' \in S} |u_j|^2 |u_{j'}|^2 , \qquad
A_2 = \sum_{\substack{j, j' \in S \\ j + j' \neq 0}} |u_j|^2 |u_{j'}|^2 = \sum_{j, j' \in S} |u_j|^2 |u_{j'}|^2 - \sum_{j \in S} |u_j|^4 ,
\]

\[
A_3 = \sum_{\substack{j, j' \in S \\ j \pm j' \neq 0}} |u_j|^2 |u_{j'}|^2 = \sum_{j, j' \in S} |u_j|^2 |u_{j'}|^2 - 2 \sum_{j \in S} |u_j|^4 ,
\]

whence (3.8) follows. □
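
The two algebraic facts used in the proof (the cubic identity of Lemma 3.2 and the value of the resonant sum $A_1 + A_2 + A_3 = 3 \sum_{j,j' \in S} |u_j|^2 |u_{j'}|^2 - 3 \sum_{j \in S} |u_j|^4$) lend themselves to a brute-force check. The sketch below is purely illustrative: the tangential sites and amplitudes are arbitrary sample values, and the function names are hypothetical.

```python
from itertools import product

def cubic_identity_holds(j1, j2, j3):
    # Lemma 3.2 with j4 fixed by the constraint j1 + j2 + j3 + j4 = 0.
    j4 = -(j1 + j2 + j3)
    lhs = j1 ** 3 + j2 ** 3 + j3 ** 3 + j4 ** 3
    rhs = -3 * (j1 + j2) * (j1 + j3) * (j2 + j3)
    return lhs == rhs

def resonant_quartic_sum(amplitudes):
    # amplitudes maps each positive site j to u_j; the symmetric set S is
    # completed by u_{-j} = conj(u_j) (real-valued functions).
    u = {}
    for j, c in amplitudes.items():
        u[j] = complex(c)
        u[-j] = complex(c).conjugate()
    total = 0
    for j1, j2, j3 in product(sorted(u), repeat=3):
        j4 = -(j1 + j2 + j3)
        if j4 in u and j1 ** 3 + j2 ** 3 + j3 ** 3 + j4 ** 3 == 0:
            total += u[j1] * u[j2] * u[j3] * u[j4]
    return total

def closed_form(amplitudes):
    # 3 * (sum_{j in S} |u_j|^2)^2 - 3 * sum_{j in S} |u_j|^4, with S = {+-j}.
    m2 = sum(2 * abs(c) ** 2 for c in amplitudes.values())
    m4 = sum(2 * abs(c) ** 4 for c in amplitudes.values())
    return 3 * m2 ** 2 - 3 * m4

sample = {1: 1 + 2j, 2: -1j, 4: 2}
print(all(cubic_identity_holds(*js) for js in product(range(-6, 7), repeat=3)))
print(resonant_quartic_sum(sample), closed_form(sample))
```

The enumeration uses the same two constraints as the proof (zero sum of the indices and zero sum of their cubes), so agreement of the two printed values is a direct check of the splitting into $A_1$, $A_2$, $A_3$ for the chosen sample.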

Remark 3.3. In the Birkhoff normal form for the Hamiltonian $K = H + \lambda M^2$
defined in (1.18), three additional terms appear in (3.8), which are

\[
\lambda \sum_{j, j' \in S} |u_j|^2 |u_{j'}|^2 + 2 \lambda M(v) M(z) + \lambda M^2(z) .
\]

Then in (3.8) the sum $(\lambda - \frac{3\varsigma}{4}) \sum_{j, j' \in S} |u_j|^2 |u_{j'}|^2$ vanishes if we choose $\lambda := 3\varsigma/4$. □


4 Action-angle variables 

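We introduce action-angle variables on the tangential directions by the change of
coordinates

\[
u_j := \sqrt{\tilde\xi_j + |j| \tilde y_j}\; e^{\mathrm{i} \tilde\theta_j} \ \text{ for } j \in S ; \qquad u_j := \tilde z_j \ \text{ for } j \in S^c , \tag{4.1}
\]

where (recall that $u_{-j} = \bar u_j$)

\[
\tilde\xi_{-j} = \tilde\xi_j , \quad \tilde\xi_j > 0 , \quad \tilde y_{-j} = \tilde y_j , \quad \tilde\theta_{-j} = - \tilde\theta_j , \quad \tilde\theta_j , \tilde y_j \in \mathbb{R} , \quad \forall j \in S . \tag{4.2}
\]

To simplify notation, for the tangential sites $S^+ := \{ \bar\jmath_1, \ldots, \bar\jmath_\nu \}$ we also denote
$\tilde\theta_{\bar\jmath_i} := \tilde\theta_i$, $\tilde y_{\bar\jmath_i} := \tilde y_i$, $\tilde\xi_{\bar\jmath_i} := \tilde\xi_i$, $i = 1, \ldots, \nu$.
The symplectic 2-form in (3.2) (i.e. (1.6)) becomes

\[
\mathcal{W} := \sum_{i=1}^{\nu} d\tilde\theta_i \wedge d\tilde y_i + \frac12 \sum_{j \in S^c \setminus \{0\}} \frac{1}{\mathrm{i} j}\, d\tilde z_j \wedge d\tilde z_{-j}
= \Big( \sum_{i=1}^{\nu} d\tilde\theta_i \wedge d\tilde y_i \Big) \oplus \Omega_{S^\perp} = d\Lambda , \tag{4.3}
\]

where $\Omega_{S^\perp}$ denotes the restriction of $\Omega$ to $H_S^\perp$ (see (1.20)) and $\Lambda$ is the Liouville
1-form on $\mathbb{T}^\nu \times \mathbb{R}^\nu \times H_S^\perp$ defined by $\Lambda_{(\tilde\theta, \tilde y, \tilde z)} : \mathbb{R}^\nu \times \mathbb{R}^\nu \times H_S^\perp \to \mathbb{R}$,

\[
\Lambda_{(\tilde\theta, \tilde y, \tilde z)} [\hat\theta, \hat y, \hat z] := - \tilde y \cdot \hat\theta + \frac12 (\partial_x^{-1} \tilde z, \hat z)_{L^2(\mathbb{T})} . \tag{4.4}
\]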


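We rescale the “unperturbed actions” $\tilde\xi$ and the variables $\tilde\theta, \tilde y, \tilde z$ as

\[
\tilde\xi = \varepsilon^2 \xi , \qquad \tilde y = \varepsilon^{2b} y , \qquad \tilde z = \varepsilon^{b} z , \qquad b \ge 1 . \tag{4.5}
\]

The symplectic 2-form in (4.3) transforms into $\varepsilon^{2b} \mathcal{W}$. Hence the Hamiltonian system
generated by $\mathcal{H}$ in (3.7) transforms into the new Hamiltonian system

\[
\begin{cases}
\dot\theta = \partial_y H_\varepsilon(\theta, y, z) , \\
\dot y = - \partial_\theta H_\varepsilon(\theta, y, z) , \\
z_t = \partial_x \nabla_z H_\varepsilon(\theta, y, z) ,
\end{cases}
\qquad H_\varepsilon := \varepsilon^{-2b}\, \mathcal{H} \circ A_\varepsilon , \tag{4.6}
\]

where

\[
A_\varepsilon(\theta, y, z) := \varepsilon v_\varepsilon(\theta, y) + \varepsilon^b z , \qquad
v_\varepsilon(\theta, y) := \sum_{j \in S} \sqrt{\xi_j + \varepsilon^{2(b-1)} |j| y_j}\; e^{\mathrm{i} \theta_j} e^{\mathrm{i} j x} . \tag{4.7}
\]

We still denote by

\[
X_{H_\varepsilon} = (\partial_y H_\varepsilon , - \partial_\theta H_\varepsilon , \partial_x \nabla_z H_\varepsilon)
\]

the Hamiltonian vector field in the variables $(\theta, y, z) \in \mathbb{T}^\nu \times \mathbb{R}^\nu \times H_S^\perp$.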

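We now write explicitly the Hamiltonian $H_\varepsilon(\theta, y, z)$ defined in (4.6). Recall the
expression of $\mathcal{H}$ given in (3.7). The quadratic Hamiltonian $H_2$ in (3.4) transforms
into

\[
\varepsilon^{-2b} H_2 \circ A_\varepsilon = \mathrm{const} + \sum_{j \in S^+} j^3 y_j + \frac12 \int_{\mathbb{T}} z_x^2\,dx , \tag{4.8}
\]

and, by (3.8), (4.7) we get (writing, in short, $v_\varepsilon := v_\varepsilon(\theta, y)$)

\[
\begin{aligned}
H_\varepsilon(\theta, y, z) = {} & e(\xi) + \alpha(\xi) \cdot y + \frac12 \int_{\mathbb{T}} z_x^2\,dx - \frac{3\varsigma}{2} \varepsilon^2 \int_{\mathbb{T}} v_\varepsilon^2 z^2\,dx \\
& + \frac{3\varsigma}{2} \varepsilon^{2b} \Big( \sum_{j \in S^+} j^2 y_j^2 - 2 \sum_{j, j' \in S^+} j j' y_j y_{j'} \Big) - \varsigma \varepsilon^{1+b} \int_{\mathbb{T}} v_\varepsilon z^3\,dx \\
& - \frac{\varsigma}{4} \varepsilon^{2b} \int_{\mathbb{T}} z^4\,dx + \varepsilon^{-2b} \mathcal{H}_{\ge 5}(\varepsilon v_\varepsilon(\theta, y) + \varepsilon^b z)
\end{aligned} \tag{4.9}
\]

where $e(\xi)$ is a constant, and $\alpha(\xi) \in \mathbb{R}^\nu$ is the vector of components

\[
\alpha_i(\xi) := \bar\jmath_i^{\,3} + 3 \varsigma \varepsilon^2 [\xi_i - 2(\xi_1 + \ldots + \xi_\nu)]\, \bar\jmath_i , \qquad i = 1, \ldots, \nu .
\]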

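This is the “frequency-to-amplitude” map which describes, at the main order, how
the tangential frequencies are shifted by the amplitudes $\xi := (\xi_1, \ldots, \xi_\nu)$. It can be
written in compact form as

\[
\alpha(\xi) := \bar\omega + \varepsilon^2 \mathbb{A} \xi , \qquad \mathbb{A} := 3 \varsigma D_S (I - 2U) , \tag{4.10}
\]

where $\bar\omega := (\bar\jmath_1^{\,3}, \ldots, \bar\jmath_\nu^{\,3}) \in \mathbb{N}^\nu$ (see (1.19)) is the vector of the unperturbed linear
frequencies of oscillations on the tangential sites, $D_S$ is the diagonal matrix

\[
D_S := \mathrm{diag}(\bar\jmath_1, \ldots, \bar\jmath_\nu) \in \mathrm{Mat}(\nu \times \nu) ,
\]

$I$ is the $\nu \times \nu$ identity matrix, and $U$ is the $\nu \times \nu$ matrix with all entries equal to
1. The matrix $\mathbb{A}$ is often called the “twist” matrix. It turns out to be invertible.
Indeed, since $U^2 = \nu U$, one has $(I - 2U)\big(I - \frac{2}{2\nu - 1} U\big) = I$, and therefore

\[
\mathbb{A}^{-1} = \frac{1}{3\varsigma} \Big( I - \frac{2}{2\nu - 1} U \Big) D_S^{-1} . \tag{4.11}
\]

With this notation, one can also write

\[
\sum_{j \in S^+} j^2 y_j^2 - 2 \sum_{j, j' \in S^+} j j' y_j y_{j'} = (I - 2U)(D_S y) \cdot (D_S y) . \tag{4.12}
\]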

Remark 4.1. By remark 3.3, for the Hamiltonian $K = H + \lambda M^2$, $\lambda := 3\varsigma/4$,
defined in (1.18) the twist matrix in the frequency-amplitude relation (4.10) becomes
$\mathbb{A} = 3 \varsigma D_S$, which is diagonal. □

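We write the Hamiltonian in (4.9) (eliminating the constant $e(\xi)$ which is
irrelevant for the dynamics) as $H_\varepsilon = \mathcal{N} + P$, where

\[
\mathcal{N}(\theta, y, z) = \alpha(\xi) \cdot y + \frac12 (N(\theta) z, z)_{L^2(\mathbb{T})} , \qquad
(N(\theta) z, z)_{L^2(\mathbb{T})} := \int_{\mathbb{T}} z_x^2\,dx - 3 \varsigma \varepsilon^2 \int_{\mathbb{T}} v_\varepsilon^2(\theta, 0) z^2\,dx , \tag{4.13}
\]

describes the linear dynamics, and $P := H_\varepsilon - \mathcal{N}$, namely

\[
\begin{aligned}
P := {} & \frac{3\varsigma}{2} \varepsilon^{2b} (I - 2U)(D_S y) \cdot (D_S y) - \frac{3\varsigma}{2} \varepsilon^2 \int_{\mathbb{T}} \big[ v_\varepsilon^2(\theta, y) - v_\varepsilon^2(\theta, 0) \big] z^2\,dx \\
& - \varsigma \varepsilon^{1+b} \int_{\mathbb{T}} v_\varepsilon(\theta, y) z^3\,dx - \frac{\varsigma}{4} \varepsilon^{2b} \int_{\mathbb{T}} z^4\,dx + \varepsilon^{-2b} \mathcal{H}_{\ge 5}(\varepsilon v_\varepsilon(\theta, y) + \varepsilon^b z) ,
\end{aligned} \tag{4.14}
\]

collects the nonlinear perturbative effects.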
5 The nonlinear functional setting

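We look for an embedded invariant torus

\[
i : \mathbb{T}^\nu \to \mathbb{T}^\nu \times \mathbb{R}^\nu \times H_S^\perp , \qquad \varphi \mapsto i(\varphi) := (\theta(\varphi), y(\varphi), z(\varphi)) \tag{5.1}
\]

of the Hamiltonian vector field $X_{H_\varepsilon}$ filled by quasi-periodic solutions with diophantine
frequency $\omega \in \mathbb{R}^\nu$, that we regard as independent parameters. We require that
$\omega$ belongs to the set

\[
\Omega_\varepsilon := \alpha([1,2]^\nu) = \{ \alpha(\xi) : \xi \in [1,2]^\nu \} \tag{5.2}
\]

where $\alpha$ is the affine diffeomorphism (4.10). Since any $\omega \in \Omega_\varepsilon$ is $\varepsilon^2$-close to the
integer vector $\bar\omega \in \mathbb{N}^\nu$ (see (4.10), (1.19)), we require that the constant $\gamma$ in the
diophantine inequality

\[
|\omega \cdot l| \ge \gamma \langle l \rangle^{-\tau} , \quad \forall l \in \mathbb{Z}^\nu \setminus \{0\} , \quad \text{satisfies } \gamma = \varepsilon^{2+a} \text{ for some } a > 0 . \tag{5.3}
\]

Note that the definition of $\gamma$ in (5.3) is slightly stronger than the minimal condition,
which is $\gamma \le c \varepsilon^2$ with $c$ small enough. In addition to (5.3) we shall also require that
$\omega$ satisfies the first and second order Melnikov non-resonance conditions (8.63).

We fix the amplitude $\xi$ as a function of $\omega$ and $\varepsilon$, as

\[
\xi := \varepsilon^{-2} \mathbb{A}^{-1} [\omega - \bar\omega] , \tag{5.4}
\]

so that $\alpha(\xi) = \omega$ (see (4.10)).

Now we look for an embedded invariant torus of the modified Hamiltonian vector
field $X_{H_{\varepsilon,\zeta}} = X_{H_\varepsilon} + (0, \zeta, 0)$, $\zeta \in \mathbb{R}^\nu$, which is generated by the Hamiltonian

\[
H_{\varepsilon,\zeta}(\theta, y, z) := H_\varepsilon(\theta, y, z) + \zeta \cdot \theta , \qquad \zeta \in \mathbb{R}^\nu . \tag{5.5}
\]

Note that the vector field $X_{H_{\varepsilon,\zeta}}$ is periodic in $\theta$ (unlike the Hamiltonian $H_{\varepsilon,\zeta}$). We
introduce $\zeta$ in order to adjust the average in the second equation of the linearized
system (6.22), see (6.23). The vector $\zeta$ has however no dynamical consequences.
Indeed it turns out that an invariant torus for the Hamiltonian vector field $X_{H_{\varepsilon,\zeta}}$
is actually invariant for $X_{H_\varepsilon}$ itself, see Lemma 6.1. Hence we look for zeros of the
nonlinear operator

\[
\begin{aligned}
\mathcal{F}(i, \zeta) := \mathcal{F}(i, \zeta, \omega, \varepsilon) & := \mathcal{D}_\omega i(\varphi) - X_{H_\varepsilon}(i(\varphi)) + (0, \zeta, 0) \\
& = \begin{pmatrix}
\mathcal{D}_\omega \theta(\varphi) - \partial_y H_\varepsilon(i(\varphi)) \\
\mathcal{D}_\omega y(\varphi) + \partial_\theta H_\varepsilon(i(\varphi)) + \zeta \\
\mathcal{D}_\omega z(\varphi) - \partial_x \nabla_z H_\varepsilon(i(\varphi))
\end{pmatrix} \\
& = \begin{pmatrix}
\mathcal{D}_\omega \Theta(\varphi) - \partial_y P(i(\varphi)) \\
\mathcal{D}_\omega y(\varphi) + \frac12 \partial_\theta (N(\theta(\varphi)) z(\varphi), z(\varphi))_{L^2(\mathbb{T})} + \partial_\theta P(i(\varphi)) + \zeta \\
\mathcal{D}_\omega z(\varphi) - \partial_x N(\theta(\varphi)) z(\varphi) - \partial_x \nabla_z P(i(\varphi))
\end{pmatrix}
\end{aligned} \tag{5.6}
\]

where $\Theta(\varphi) := \theta(\varphi) - \varphi$ is $(2\pi)^\nu$-periodic and we use (here and everywhere in the
paper) the short notation

\[
\mathcal{D}_\omega := \omega \cdot \partial_\varphi . \tag{5.7}
\]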
The Sobolev norm of the periodic component of the embedded torus

\[
\mathfrak{I}(\varphi) := i(\varphi) - (\varphi, 0, 0) := (\Theta(\varphi), y(\varphi), z(\varphi)) , \qquad \Theta(\varphi) := \theta(\varphi) - \varphi , \tag{5.8}
\]

is $\|\mathfrak{I}\|_s := \|\Theta\|_{H^s_\varphi} + \|y\|_{H^s_\varphi} + \|z\|_s$ where $\|z\|_s := \|z\|_{H^s_{\varphi,x}}$ is defined in (2.3). We link
the rescaling (4.5) with the diophantine constant $\gamma = \varepsilon^{2+a}$ by choosing

\[
\gamma = \varepsilon^{2+a} = \varepsilon^{2b} , \qquad b := 1 + (a/2) , \qquad a \in (0, 1/6) . \tag{5.9}
\]

Other choices are possible, see Remark 5.2.

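Theorem 5.1. Let the tangential sites $S$ in (1.11) satisfy (1.12). For all $\varepsilon \in (0, \varepsilon_0)$,
where $\varepsilon_0$ is small enough, there exist a constant $C > 0$ and a Cantor-like set $\mathcal{C}_\varepsilon \subset \Omega_\varepsilon$,
with asymptotically full measure as $\varepsilon \to 0$, namely

\[
\lim_{\varepsilon \to 0} \frac{|\mathcal{C}_\varepsilon|}{|\Omega_\varepsilon|} = 1 , \tag{5.10}
\]

such that, for all $\omega \in \mathcal{C}_\varepsilon$, there exists a solution $i_\infty(\varphi) := i_\infty(\omega, \varepsilon)(\varphi)$ of the equation
$\mathcal{F}(i_\infty, 0, \omega, \varepsilon) = 0$ (the nonlinear operator $\mathcal{F}(i, \zeta, \omega, \varepsilon)$ is defined in (5.6)). Hence
the embedded torus $\varphi \mapsto i_\infty(\varphi)$ is invariant for the Hamiltonian vector field $X_{H_\varepsilon}$,
and it is filled by quasi-periodic solutions with frequency $\omega$. The torus $i_\infty$ satisfies

\[
\| i_\infty(\varphi) - (\varphi, 0, 0) \|^{\mathrm{Lip}(\gamma)}_{s_0 + \mu} \le C \varepsilon^{5-2b} \gamma^{-1} = C \varepsilon^{1-2a} \tag{5.11}
\]

for some $\mu := \mu(\nu) > 0$. Moreover, the torus $i_\infty$ is linearly stable.

Theorem 5.1 is proved in sections 6-9. It implies Theorem 1.1 where the $\xi_j$
in (1.13) are the components of the vector $\mathbb{A}^{-1}[\omega - \bar\omega]$. By (5.11), going back to
the variables before the rescaling (4.5), we get $\tilde\Theta_\infty = O(\varepsilon^{5-4b})$, $\tilde y_\infty = O(\varepsilon^{5-2b})$,
$\tilde z_\infty = O(\varepsilon^{5-3b})$.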

Remark 5.2. The way to link the amplitude-rescaling (4.5) with the diophantine
constant $\gamma = \varepsilon^{2+a}$ in (5.3) is not unique.

The choice $\varepsilon^{2b} < \gamma$ (i.e. “$b > 1$ large”) reduces to study the Hamiltonian $H_\varepsilon$ in
(4.9) as a perturbation of an isochronous system (as in [22], [21], [23]). We can take
$b = 4/3$ in order to minimize the size of the perturbation $P = O(\varepsilon^{7/3})$, estimating
uniformly all the terms in the last two lines of (4.9). As a counterpart we have
to regard in (4.9) the constants $\alpha := \alpha(\xi) \in \mathbb{R}^\nu$ (or $\xi$ in (4.7)) as independent
variables. This is the perspective described for example in m- Then the Nash- 
Moser scheme produces iteratively a sequence of $\xi_n = \xi_n(\omega)$ and embeddings $\varphi \mapsto
i_n(\varphi) := (\theta_n(\varphi), y_n(\varphi), z_n(\varphi))$ at the same time.

The case $\varepsilon^{2b} > \gamma$ (i.e. “$b \ge 1$ small”), in particular if $b = 1$, reduces to study
the Hamiltonian $H_\varepsilon$ in (4.9) as a perturbation of a non-isochronous system à la
Arnold-Kolmogorov (note that the quadratic Hamiltonian in (4.12) satisfies the usual
Kolmogorov non-degeneracy condition). In this case, the constant $\xi_j$ in (4.7) and
the average of $|j| y_j(\varphi)$ have the same size and therefore the same role. Then we may
consider $\xi_j$ as fixed, and tune the average of the action component $y_j(\varphi)$ in order to
solve the linear equation (6.28), which corresponds to the angle component. We use
the invertible (averaged) “twist”-matrix (6.30) to impose that the right hand side
in (6.28) has zero average.

The intermediate case $\varepsilon^{2b} = \gamma$, adopted in this paper (as well as in [5]), has the
advantage of avoiding the introduction of $\xi(\omega)$ as an independent variable, and it
also makes it possible to estimate uniformly the sizes of the components of $(\Theta(\varphi), y(\varphi), z(\varphi))$
with no distinctions. □
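
The exponent bookkeeping behind Remark 5.2 and (5.11) can be reproduced with exact fractions. The sketch below is illustrative only: it assumes that the terms in the last two lines of (4.9) carry the powers $\varepsilon^{2b}$, $\varepsilon^{1+b}$, $\varepsilon^{2b}$ and $\varepsilon^{5-2b}$ (the last one under the assumption that $\mathcal{H}_{\ge 5}(\varepsilon v_\varepsilon + \varepsilon^b z)$ is of size $\varepsilon^5$), and the function names are hypothetical.

```python
from fractions import Fraction

def perturbation_exponents(b):
    # Powers of epsilon of the terms in the last two lines of (4.9):
    # eps^{2b} (quadratic in y), eps^{1+b} (v z^3), eps^{2b} (z^4) and
    # eps^{5-2b} (terms of order >= 5, assumed O(eps^5) before rescaling).
    return [2 * b, 1 + b, 2 * b, 5 - 2 * b]

def size_of_P(b):
    # P = O(eps^m) with m the smallest exponent.
    return min(perturbation_exponents(b))

# Scan b in [1, 2] on a grid of step 1/12 and keep the b with the largest exponent.
grid = [Fraction(12 + k, 12) for k in range(13)]
best_b = max(grid, key=size_of_P)
print(best_b, size_of_P(best_b))

# Identity used in (5.11): with b = 1 + a/2 one has 5 - 4b = 1 - 2a.
for a in (Fraction(1, 10), Fraction(1, 8), Fraction(1, 7)):
    b = 1 + a / 2
    print(a, 5 - 4 * b == 1 - 2 * a)
```

On this grid the smallest exponent is largest at $b = 4/3$, where it equals $7/3$, in agreement with the size $P = O(\varepsilon^{7/3})$ quoted in the remark.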

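Now we prove tame estimates for the composition operator induced by the Hamiltonian
vector fields $X_{\mathcal{N}}$ and $X_P$ in (5.6), which are used in the next sections. Since
the functions $y \mapsto \sqrt{\xi + \varepsilon^{2(b-1)} |j| y}$, $\theta \mapsto e^{\mathrm{i}\theta}$ are analytic for $\varepsilon$ small enough and
$|y| \le C$, the composition Lemma 2.2 implies that, for all $\Theta, y \in H^s(\mathbb{T}^\nu, \mathbb{R}^\nu)$ with
$\|\Theta\|_{s_0}, \|y\|_{s_0} \le 1$, setting $\theta(\varphi) := \varphi + \Theta(\varphi)$, one has the tame estimate

\[
\| v_\varepsilon(\theta(\varphi), y(\varphi)) \|_s \le_s 1 + \|\Theta\|_s + \|y\|_s .
\]

Hence the map $A_\varepsilon$ in (4.7) satisfies, for all $\|\mathfrak{I}\|^{\mathrm{Lip}(\gamma)}_{s_0} \le 1$ (see (5.8))

\[
\| A_\varepsilon(\theta(\varphi), y(\varphi), z(\varphi)) \|^{\mathrm{Lip}(\gamma)}_s \le_s \varepsilon \big( 1 + \|\mathfrak{I}\|^{\mathrm{Lip}(\gamma)}_s \big) . \tag{5.12}
\]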

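In the following lemma we collect tame estimates for the Hamiltonian vector fields
$X_{\mathcal{N}}$, $X_P$, $X_{H_\varepsilon}$ (see (4.13), (4.14)) whose proof is a direct application of classical
tame product and composition estimates.

Lemma 5.3. Let $\mathfrak{I}(\varphi)$ in (5.8) satisfy $\|\mathfrak{I}\|^{\mathrm{Lip}(\gamma)}_{s_0+3} \le C \varepsilon^{5-2b} \gamma^{-1} = C \varepsilon^{5-4b}$. Then,
writing in short $\|\ \|_s$ to indicate $\|\ \|^{\mathrm{Lip}(\gamma)}_s$, one has

\[
\begin{aligned}
\|\partial_y P(i)\|_s & \le_s \varepsilon^3 + \varepsilon^{2b} \|\mathfrak{I}\|_{s+3} , & \|\partial_\theta P(i)\|_s & \le_s \varepsilon^{5-2b} (1 + \|\mathfrak{I}\|_{s+3}) , \\
\|\nabla_z P(i)\|_s & \le_s \varepsilon^{4-b} + \varepsilon^{6-3b} \|\mathfrak{I}\|_{s+3} , & \|X_P(i)\|_s & \le_s \varepsilon^{5-2b} + \varepsilon^{2b} \|\mathfrak{I}\|_{s+3} , \\
\|\partial_\theta \partial_y P(i)\|_s & \le_s \varepsilon^3 + \varepsilon^{5-2b} \|\mathfrak{I}\|_{s+3} , & \|\partial_y \nabla_z P(i)\|_s & \le_s \varepsilon^{2+b} + \varepsilon^{2b} \|\mathfrak{I}\|_{s+3} , \\
\|\partial_{yy} P(i) - \varepsilon^{2b} \mathbb{A} D_S\|_s & \le_s \varepsilon^{1+2b} + \varepsilon^3 \|\mathfrak{I}\|_{s+3}
\end{aligned}
\]

($\mathbb{A}$, $D_S$ are defined in (4.10)) and, for all $\hat\imath := (\hat\Theta, \hat y, \hat z)$,

\[
\|\partial_y d_i X_P(i)[\hat\imath]\|_s \le_s \varepsilon^{2b} \big( \|\hat\imath\|_{s+3} + \|\mathfrak{I}\|_{s+3} \|\hat\imath\|_{s_0+3} \big) , \tag{5.13}
\]

\[
\| d_i X_{H_\varepsilon}(i)[\hat\imath] + (0, 0, \partial_{xxx} \hat z) \|_s \le_s \varepsilon^2 \big( \|\hat\imath\|_{s+3} + \|\mathfrak{I}\|_{s+3} \|\hat\imath\|_{s_0+3} \big) , \tag{5.14}
\]

\[
\| d_i^2 X_{H_\varepsilon}(i)[\hat\imath, \hat\imath] \|_s \le_s \varepsilon^2 \big( \|\hat\imath\|_{s+3} \|\hat\imath\|_{s_0+3} + \|\mathfrak{I}\|_{s+3} \|\hat\imath\|^2_{s_0+3} \big) . \tag{5.15}
\]

In the sequel we also use that, by the diophantine condition (5.3), the operator
$\mathcal{D}_\omega^{-1}$ (see (5.7)) is defined for all functions $u$ with zero $\varphi$-average, and satisfies

\[
\|\mathcal{D}_\omega^{-1} u\|_s \le C \gamma^{-1} \|u\|_{s+\tau} , \qquad \|\mathcal{D}_\omega^{-1} u\|^{\mathrm{Lip}(\gamma)}_s \le C \gamma^{-1} \|u\|^{\mathrm{Lip}(\gamma)}_{s+2\tau+1} . \tag{5.16}
\]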
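
On Fourier coefficients in $\varphi$, the operator $\mathcal{D}_\omega = \omega \cdot \partial_\varphi$ multiplies the $l$-th coefficient by $\mathrm{i}\, \omega \cdot l$, so $\mathcal{D}_\omega^{-1}$ divides by it, which is why the zero-average condition and the small divisor bound (5.3) enter. The following is a minimal sketch of this mechanism, not the paper's construction: functions are represented by finite dictionaries of Fourier coefficients, the weight $\langle l \rangle := \max(1, |l|_\infty)$ is an assumption, the diophantine test only inspects finitely many $l$, and all names are hypothetical.

```python
from itertools import product

def omega_dot(omega, l):
    return sum(w * k for w, k in zip(omega, l))

def bracket(l):
    # Assumed weight <l> := max(1, sup-norm of l).
    return max(1, max(abs(k) for k in l))

def is_diophantine_up_to(omega, gamma, tau, cutoff):
    # Finite test of |omega . l| >= gamma <l>^{-tau} for 0 < |l|_inf <= cutoff.
    for l in product(range(-cutoff, cutoff + 1), repeat=len(omega)):
        if any(l) and abs(omega_dot(omega, l)) < gamma * bracket(l) ** (-tau):
            return False
    return True

def apply_D_omega(coeffs, omega):
    # (D_omega u)_l = i (omega . l) u_l.
    return {l: 1j * omega_dot(omega, l) * c for l, c in coeffs.items()}

def invert_D_omega(coeffs, omega):
    # (D_omega^{-1} u)_l = u_l / (i omega . l); requires zero average (no l = 0 mode).
    inverse = {}
    for l, c in coeffs.items():
        if not any(l):
            raise ValueError("D_omega^{-1} is defined only on zero-average functions")
        inverse[l] = c / (1j * omega_dot(omega, l))
    return inverse

omega = (1.0, 2 ** 0.5)  # sample frequency vector (illustrative)
u = {(1, 0): 1 + 0j, (0, 1): 2 - 1j, (2, -3): 0.5j}
w = apply_D_omega(invert_D_omega(u, omega), omega)
print(is_diophantine_up_to(omega, gamma=0.1, tau=2, cutoff=20))
print(max(abs(w[l] - u[l]) for l in u))
```

The division by $\omega \cdot l$ is the origin of the factor $\gamma^{-1}$ and of the loss of $\tau$ derivatives in (5.16).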

6 Approximate inverse 

In order to implement a convergent Nash-Moser scheme that leads to a solution of
$\mathcal{F}(i, \zeta) = 0$, we now construct an approximate right inverse (which satisfies tame
estimates) of the linearized operator

\[
d_{i,\zeta} \mathcal{F}(i_0, \zeta_0)[\hat\imath, \hat\zeta] = \mathcal{D}_\omega \hat\imath - d_i X_{H_\varepsilon}(i_0(\varphi))[\hat\imath] + (0, \hat\zeta, 0) , \tag{6.1}
\]

see Theorem 6.9. Note that $d_{i,\zeta} \mathcal{F}(i_0, \zeta_0)$ is independent of $\zeta_0$ (see (5.6)).
The notion of approximate right inverse is introduced in m- It denotes a linear
operator which is an exact right inverse at a solution $(i_0, \zeta_0)$ of $\mathcal{F}(i_0, \zeta_0) = 0$. We
implement the general strategy in [10] which reduces the search of an approximate
right inverse of (6.1) to the search of an approximate inverse on the normal directions
only.

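It is well known that an invariant torus $i_0$ with diophantine flow is isotropic (see
e.g. [10]), namely the pull-back 1-form $i_0^* \Lambda$ is closed, where $\Lambda$ is the Liouville 1-form
in (4.4). This is tantamount to saying that the 2-form $\mathcal{W}$ (see (4.3)) vanishes on the
torus $i_0(\mathbb{T}^\nu)$, because $i_0^* \mathcal{W} = i_0^* d\Lambda = d\, i_0^* \Lambda$. For an “approximately invariant” torus
$i_0$ the 1-form $i_0^* \Lambda$ is only “approximately closed”. In order to make this statement
quantitative we consider

\[
i_0^* \Lambda = \sum_{k=1}^{\nu} a_k(\varphi)\, d\varphi_k , \qquad
a_k(\varphi) := - \big( [\partial_\varphi \theta_0(\varphi)]^T y_0(\varphi) \big)_k + \frac12 \big( \partial_{\varphi_k} z_0(\varphi), \partial_x^{-1} z_0(\varphi) \big)_{L^2(\mathbb{T})} \tag{6.2}
\]

and we quantify how small is

\[
i_0^* \mathcal{W} = d\, i_0^* \Lambda = \sum_{1 \le k < j \le \nu} A_{kj}(\varphi)\, d\varphi_k \wedge d\varphi_j , \qquad A_{kj}(\varphi) := \partial_{\varphi_k} a_j(\varphi) - \partial_{\varphi_j} a_k(\varphi) . \tag{6.3}
\]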

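Along this section we will always assume the following hypothesis (which will be
verified at each step of the Nash-Moser iteration):

• Assumption. The map $\omega \mapsto i_0(\omega)$ is a Lipschitz function defined on some subset
$\Omega_o \subset \Omega_\varepsilon$, where $\Omega_\varepsilon$ is defined in (5.2), and, for some $\mu := \mu(\tau, \nu) > 0$,

\[
\begin{gathered}
\|\mathfrak{I}_0\|^{\mathrm{Lip}(\gamma)}_{s_0+\mu} \le C \varepsilon^{5-2b} \gamma^{-1} = C \varepsilon^{5-4b} , \qquad \|Z\|^{\mathrm{Lip}(\gamma)}_{s_0+\mu} \le C \varepsilon^{5-2b} , \\
\gamma = \varepsilon^{2+a} = \varepsilon^{2b} , \qquad b := 1 + (a/2) , \qquad a \in (0, 1/6) ,
\end{gathered} \tag{6.4}
\]

where $\mathfrak{I}_0(\varphi) := i_0(\varphi) - (\varphi, 0, 0)$, and

\[
Z(\varphi) := (Z_1, Z_2, Z_3)(\varphi) := \mathcal{F}(i_0, \zeta_0)(\varphi) = \omega \cdot \partial_\varphi i_0(\varphi) - X_{H_{\varepsilon,\zeta_0}}(i_0(\varphi)) \tag{6.5}
\]

is the “error” function.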

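Lemma 6.1 (Lemma 6.1 in [5]). $|\zeta_0|^{\mathrm{Lip}(\gamma)} \le C \|Z\|^{\mathrm{Lip}(\gamma)}_{s_0}$. If $\mathcal{F}(i_0, \zeta_0) = 0$, then
$\zeta_0 = 0$, and the torus $i_0(\varphi)$ is invariant for $X_{H_\varepsilon}$.

Now we estimate the size of $i_0^* \mathcal{W}$ in terms of $Z$. From (6.2), (6.3) one has
$\|A_{kj}\|^{\mathrm{Lip}(\gamma)}_s \le_s \|\mathfrak{I}_0\|^{\mathrm{Lip}(\gamma)}_{s+2}$. Moreover, $A_{kj}$ also satisfies the following bound.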

Lemma 6.2 (Lemma 6.2 in [5]). The coefficients A k j(ip) in (16.31) satisfy 

<, i-yiuiihiA+ iizik?ip»ii^) • ( 6 - 6 ) 

As in [10], we first modify the approximate torus $i_0$ to obtain an isotropic torus
$i_\delta$ which is still approximately invariant. We denote the Laplacian $\Delta_\varphi := \sum_{k=1}^{\nu} \partial^2_{\varphi_k}$.

Lemma 6.3 (Isotropic torus). The torus is(<p) ■= (0o(<p), ys{f)^ z o(f)) defined by 

VS ■= Vo + [d v B 0 ((p)]~ T p((p ), pj((p) := ( 6 - 7 ) 
is isotropic. If (6.4) holds, then, for some $\sigma := \sigma(\nu, \tau)$,

llw - »ll. LlpW <• l|3olt?i 7) , (6.8) 

llw ~ »C‘ p(7 > <» 1- 1 {\\Z& ] + IIZIlK’lPollS 11 } , (6.9) 

l|J 7 (ij.Co)ft‘ p(l) <• IIZIlS' 7 ’ + IPollS 7) l|Z|l“+ ( ? . (6-10) 

iifli[w]Hii.<.n- +Pollen.. (6.H) 


In the paper we denote equivalently the differential by $\partial_i$ or $d_i$. Moreover we
denote by $\sigma := \sigma(\nu, \tau)$ possibly different (larger) “loss of derivatives” constants.

Proof. It is sufficient to closely follow the proof of Lemma 6.3 of [5]. We mention
the only difference: equation (6.11) of [5] is
$\|\mathcal{F}(i_\delta, \zeta_0)\|^{\mathrm{Lip}(\gamma)}_s \le_s \|Z\|^{\mathrm{Lip}(\gamma)}_{s+\sigma} + \varepsilon^{2b-1} \gamma^{-1} \|\mathfrak{I}_0\|^{\mathrm{Lip}(\gamma)}_{s+\sigma} \|Z\|^{\mathrm{Lip}(\gamma)}_{s_0+\sigma}$,
with an additional large factor $\varepsilon^{2b-1} \gamma^{-1} = \varepsilon^{-1}$ with respect
to the present bound (6.10). In (6.10) there is no such factor, because, by
the estimates for $\partial_\theta \partial_y P$, $\partial_{yy} P$, $\partial_y \nabla_z P$ in Lemma 5.3, here we have $\|\partial_y X_P(i)\|_s \le_s
\varepsilon^{2b} (1 + \|\mathfrak{I}\|_{s+3})$. Hence (6.8), (6.9), (6.4) imply that

\[
\| X_P(i_\delta) - X_P(i_0) \|_s \le_s \|Z\|_{s+\sigma} + \|\mathfrak{I}_0\|_{s+\sigma} \|Z\|_{s_0+\sigma} . \tag{6.12}
\]

Then the proof goes on as in [5], without the large factor $\varepsilon^{2b-1} \gamma^{-1}$. □

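In order to find an approximate inverse of the linearized operator $d_{i,\zeta} \mathcal{F}(i_\delta)$ we
introduce a suitable set of symplectic coordinates nearby the isotropic torus $i_\delta$. We
consider the map $G_\delta : (\psi, \eta, w) \to (\theta, y, z)$ of the phase space $\mathbb{T}^\nu \times \mathbb{R}^\nu \times H_S^\perp$ defined
by

\[
\begin{pmatrix} \theta \\ y \\ z \end{pmatrix} := G_\delta \begin{pmatrix} \psi \\ \eta \\ w \end{pmatrix} :=
\begin{pmatrix} \theta_0(\psi) \\ y_\delta(\psi) + [\partial_\psi \theta_0(\psi)]^{-T} \eta + \big[ (\partial_\theta \tilde z_0)(\theta_0(\psi)) \big]^T \partial_x^{-1} w \\ z_0(\psi) + w \end{pmatrix} \tag{6.13}
\]

where $\tilde z_0(\theta) := z_0(\theta_0^{-1}(\theta))$. It is proved in [10] that $G_\delta$ is symplectic, using that the
torus $i_\delta$ is isotropic (Lemma 6.3). In the new coordinates, $i_\delta$ is the trivial embedded
torus $(\psi, \eta, w) = (\psi, 0, 0)$. The transformed Hamiltonian $K := K(\psi, \eta, w, \zeta_0)$ is
(recall (5.5))

\[
\begin{aligned}
K := H_{\varepsilon,\zeta_0} \circ G_\delta = {} & \theta_0(\psi) \cdot \zeta_0 + K_{00}(\psi) + K_{10}(\psi) \cdot \eta + (K_{01}(\psi), w)_{L^2(\mathbb{T})} + \frac12 K_{20}(\psi) \eta \cdot \eta \\
& + (K_{11}(\psi) \eta, w)_{L^2(\mathbb{T})} + \frac12 (K_{02}(\psi) w, w)_{L^2(\mathbb{T})} + K_{\ge 3}(\psi, \eta, w)
\end{aligned} \tag{6.14}
\]

where $K_{\ge 3}$ collects the terms at least cubic in the variables $(\eta, w)$. At any fixed $\psi$,
the Taylor coefficient $K_{00}(\psi) \in \mathbb{R}$, $K_{10}(\psi) \in \mathbb{R}^\nu$, $K_{01}(\psi) \in H_S^\perp$ (it is a function of
$x \in \mathbb{T}$), $K_{20}(\psi)$ is a $\nu \times \nu$ real matrix, $K_{02}(\psi)$ is a linear self-adjoint operator of $H_S^\perp$
and $K_{11}(\psi) : \mathbb{R}^\nu \to H_S^\perp$. Note that the above Taylor coefficients do not depend on
the parameter $\zeta_0$.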

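The Hamilton equations associated to (6.14) are

\[
\begin{cases}
\dot\psi = K_{10}(\psi) + K_{20}(\psi) \eta + K_{11}^T(\psi) w + \partial_\eta K_{\ge 3}(\psi, \eta, w) \\[1mm]
\dot\eta = - [\partial_\psi \theta_0(\psi)]^T \zeta_0 - \partial_\psi K_{00}(\psi) - [\partial_\psi K_{10}(\psi)]^T \eta - [\partial_\psi K_{01}(\psi)]^T w \\
\qquad - \partial_\psi \big( \tfrac12 K_{20}(\psi) \eta \cdot \eta + (K_{11}(\psi) \eta, w)_{L^2(\mathbb{T})} + \tfrac12 (K_{02}(\psi) w, w)_{L^2(\mathbb{T})} + K_{\ge 3}(\psi, \eta, w) \big) \\[1mm]
\dot w = \partial_x \big( K_{01}(\psi) + K_{11}(\psi) \eta + K_{02}(\psi) w + \nabla_w K_{\ge 3}(\psi, \eta, w) \big)
\end{cases} \tag{6.15}
\]

where $[\partial_\psi K_{10}(\psi)]^T$ is the $\nu \times \nu$ transposed matrix and the operators $[\partial_\psi K_{01}(\psi)]^T$
and $K_{11}^T(\psi) : H_S^\perp \to \mathbb{R}^\nu$ are defined by the duality relation $(\partial_\psi K_{01}(\psi)[\hat\psi], w)_{L^2} =
\hat\psi \cdot [\partial_\psi K_{01}(\psi)]^T w$, for all $\hat\psi \in \mathbb{R}^\nu$, $w \in H_S^\perp$, and similarly for $K_{11}$. Explicitly, for all
$w \in H_S^\perp$, and denoting $e_k$ the $k$-th versor of $\mathbb{R}^\nu$,

\[
K_{11}^T(\psi) w = \sum_{k=1}^{\nu} \big( K_{11}^T(\psi) w \cdot e_k \big) e_k = \sum_{k=1}^{\nu} \big( w, K_{11}(\psi) e_k \big)_{L^2(\mathbb{T})} e_k \in \mathbb{R}^\nu .
\]

In the next lemma we estimate the coefficients $K_{00}$, $K_{10}$, $K_{01}$ of the Taylor expansion
(6.14). Note that on an exact solution we have $Z = 0$ and therefore $K_{00}(\psi) = \mathrm{const}$,
$K_{10} = \omega$ and $K_{01} = 0$.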


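Lemma 6.4. Assume (6.4). Then there is $\sigma := \sigma(\tau, \nu)$ such that

\[
\|\partial_\psi K_{00}\|^{\mathrm{Lip}(\gamma)}_s , \ \|K_{10} - \omega\|^{\mathrm{Lip}(\gamma)}_s , \ \|K_{01}\|^{\mathrm{Lip}(\gamma)}_s \le_s
\|Z\|^{\mathrm{Lip}(\gamma)}_{s+\sigma} + \|Z\|^{\mathrm{Lip}(\gamma)}_{s_0+\sigma} \|\mathfrak{I}_0\|^{\mathrm{Lip}(\gamma)}_{s+\sigma} .
\]

Proof. Follow the proof of Lemma 6.4 in [5]. The fact that here there is no factor
$\varepsilon^{2b-1} \gamma^{-1}$ is a consequence of the better estimate (6.10) for $\mathcal{F}(i_\delta, \zeta_0)$ compared to the
analogous estimate in [5]. □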

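Remark 6.5. If $\mathcal{F}(i_0, \zeta_0) = 0$ then $\zeta_0 = 0$ by Lemma 6.1 and Lemma 6.4 implies
that (6.14) simplifies to the normal form

\[
K = \mathrm{const} + \omega \cdot \eta + \frac12 K_{20}(\psi) \eta \cdot \eta + (K_{11}(\psi) \eta, w)_{L^2(\mathbb{T})} + \frac12 (K_{02}(\psi) w, w)_{L^2(\mathbb{T})} + K_{\ge 3} .
\]

□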


We now estimate $K_{20}$, $K_{11}$ in (6.14). The norm of $K_{20}$ is the sum of the norms
of its matrix entries.


Lemma 6.6. Assume (16.41) . Then 

11*20 - £ 2 *AD S ||^W <, £ 2 ‘« + ^IPoll^-W , (6.16) 

ll*il-7lli lp<11 <> E 5 -“ll"7lli lp(7) +£ 2 ‘|Po|K 1 > IM|“ p(7) , (6.17) 

ii*(i»iii lpl7) <. ^imiSF+ 6 2t iPoii. L ; p i 7) wii; p + ( 2 7) • ce.is) 

In particular \\K 20 ~ £ 26 A£>s||so P ^ < Cs 5-26 , and 

ll*illll? l7) < C£ 5 - 2i ||-)ll? (7) , ll*£w|lS p(7) < C£ 5 - 2t ||»||i‘ p(7) . 

Proof. See the proof of Lemma 6.6 in [5]. □

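Consider the linear change of variables $(\hat\theta, \hat y, \hat z) = DG_\delta(\varphi, 0, 0)[\hat\psi, \hat\eta, \hat w]$, where
$DG_\delta(\varphi, 0, 0)$ is obtained by linearizing $G_\delta$ in (6.13) at $(\varphi, 0, 0)$, and it is represented
by the matrix

\[
DG_\delta(\varphi, 0, 0) = \begin{pmatrix}
\partial_\psi \theta_0(\varphi) & 0 & 0 \\
\partial_\psi y_\delta(\varphi) & [\partial_\psi \theta_0(\varphi)]^{-T} & - \big[ (\partial_\theta \tilde z_0)(\theta_0(\varphi)) \big]^T \partial_x^{-1} \\
\partial_\psi z_0(\varphi) & 0 & I
\end{pmatrix} . \tag{6.19}
\]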
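The linearized operator $d_{i,\zeta} \mathcal{F}(i_\delta, \zeta_0)$ transforms (approximately, see (6.40)) into
the operator obtained linearizing (6.15) at $(\psi, \eta, w, \zeta) = (\varphi, 0, 0, \zeta_0)$ (with $\partial_t \rightsquigarrow \mathcal{D}_\omega$),
which is the linear operator

\[
\begin{pmatrix} B_1[\hat\psi, \hat\eta, \hat w, \hat\zeta] \\ B_2[\hat\psi, \hat\eta, \hat w, \hat\zeta] \\ B_3[\hat\psi, \hat\eta, \hat w, \hat\zeta] \end{pmatrix} ,
\]

where

\[
\begin{aligned}
B_1 & := \mathcal{D}_\omega \hat\psi - \partial_\psi K_{10}(\varphi)[\hat\psi] - K_{20}(\varphi) \hat\eta - K_{11}^T(\varphi) \hat w , \\
B_2 & := \mathcal{D}_\omega \hat\eta + [\partial_\psi \theta_0(\varphi)]^T \hat\zeta + \partial_\psi [\partial_\psi \theta_0(\varphi)]^T [\hat\psi, \zeta_0] + \partial_{\psi\psi} K_{00}(\varphi)[\hat\psi] \\
& \qquad + [\partial_\psi K_{10}(\varphi)]^T \hat\eta + [\partial_\psi K_{01}(\varphi)]^T \hat w , \\
B_3 & := \mathcal{D}_\omega \hat w - \partial_x \big\{ \partial_\psi K_{01}(\varphi)[\hat\psi] + K_{11}(\varphi) \hat\eta + K_{02}(\varphi) \hat w \big\} .
\end{aligned} \tag{6.20}
\]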

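Lemma 6.7 (Lemma 6.7 in [5]). Assume (6.4) and let $\hat\imath := (\hat\psi, \hat\eta, \hat w)$. Then

\[
\| DG_\delta(\varphi, 0, 0)[\hat\imath] \|_s + \| DG_\delta(\varphi, 0, 0)^{-1}[\hat\imath] \|_s \le_s \|\hat\imath\|_s + \|\mathfrak{I}_0\|_{s+\sigma} \|\hat\imath\|_{s_0} , \tag{6.21}
\]

\[
\| D^2 G_\delta(\varphi, 0, 0)[\hat\imath_1, \hat\imath_2] \|_s \le_s \|\hat\imath_1\|_s \|\hat\imath_2\|_{s_0} + \|\hat\imath_1\|_{s_0} \|\hat\imath_2\|_s + \|\mathfrak{I}_0\|_{s+\sigma} \|\hat\imath_1\|_{s_0} \|\hat\imath_2\|_{s_0}
\]

for some $\sigma := \sigma(\nu, \tau)$. The same estimates hold for the $\|\ \|^{\mathrm{Lip}(\gamma)}_s$ norm.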

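In order to construct an approximate inverse of (6.20) it is sufficient to solve the
equation

\[
\mathbb{D}[\hat\psi, \hat\eta, \hat w, \hat\zeta] := \begin{pmatrix}
\mathcal{D}_\omega \hat\psi - K_{20}(\varphi) \hat\eta - K_{11}^T(\varphi) \hat w \\
\mathcal{D}_\omega \hat\eta + [\partial_\psi \theta_0(\varphi)]^T \hat\zeta \\
\mathcal{D}_\omega \hat w - \partial_x K_{11}(\varphi) \hat\eta - \partial_x K_{02}(\varphi) \hat w
\end{pmatrix} = \begin{pmatrix} g_1 \\ g_2 \\ g_3 \end{pmatrix} \tag{6.22}
\]

which is obtained by neglecting in $B_1, B_2, B_3$ in (6.20) the terms $\partial_\psi K_{10}$, $\partial_{\psi\psi} K_{00}$,
$\partial_\psi K_{00}$, $\partial_\psi K_{01}$ and $\partial_\psi [\partial_\psi \theta_0(\varphi)]^T [\cdot, \zeta_0]$ (these terms are naught at a solution by
Lemmata 6.4 and 6.1).

First we solve the second equation in (6.22), namely $\mathcal{D}_\omega \hat\eta = g_2 - [\partial_\psi \theta_0(\varphi)]^T \hat\zeta$.
We choose $\hat\zeta$ so that the $\varphi$-average of the right hand side is zero, namely

\[
\hat\zeta = \langle g_2 \rangle \tag{6.23}
\]

(we denote $\langle g \rangle := (2\pi)^{-\nu} \int_{\mathbb{T}^\nu} g(\varphi)\,d\varphi$). Note that the $\varphi$-averaged matrix $\langle [\partial_\psi \theta_0]^T \rangle
= \langle I + [\partial_\psi \Theta_0]^T \rangle = I$ because $\theta_0(\varphi) = \varphi + \Theta_0(\varphi)$ and $\Theta_0(\varphi)$ is a periodic function.
Therefore

\[
\hat\eta := \mathcal{D}_\omega^{-1} \big( g_2 - [\partial_\psi \theta_0(\varphi)]^T \langle g_2 \rangle \big) + \langle \hat\eta \rangle , \qquad \langle \hat\eta \rangle \in \mathbb{R}^\nu , \tag{6.24}
\]

where the average $\langle \hat\eta \rangle$ will be fixed below. Then we consider the third equation

\[
\mathcal{L}_\omega \hat w = g_3 + \partial_x K_{11}(\varphi) \hat\eta , \qquad \mathcal{L}_\omega := \omega \cdot \partial_\varphi - \partial_x K_{02}(\varphi) . \tag{6.25}
\]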

• Inversion assumption. There exists a set c such that for all c o € Hooj 
for every function g £ H S S ^(T U+1 ) there exists a solution h := C,~ x g £ H s g± (T u+1 ) 
of the linear equation LJn = g, which satisfies 

IIC 1 sll! J ‘’ (7) < CWT-'dlsIllfF +sV 1 || 3 ollh P l 7 ) Hsie i,<l) ) (6.26) 
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for some p := p(r, v) > 0. 

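By the above assumption there exists a solution

\[
\hat w := \mathcal{L}_\omega^{-1}[g_3 + \partial_xK_{11}(\varphi)\hat\eta] \tag{6.27}
\]

of (6.25). Finally, we solve the first equation in (6.22), which, substituting (6.24), (6.27), becomes

\[
\mathcal{D}_\omega\hat\psi = g_1 + M_1(\varphi)\langle\hat\eta\rangle + M_2(\varphi)g_2 + M_3(\varphi)g_3 - M_2(\varphi)[\partial_\psi\theta_0]^T\langle g_2\rangle, \tag{6.28}
\]

where

\[
M_1(\varphi) := K_{20}(\varphi) + K_{11}^T(\varphi)\mathcal{L}_\omega^{-1}\partial_xK_{11}(\varphi), \qquad M_2(\varphi) := M_1(\varphi)\mathcal{D}_\omega^{-1}, \qquad M_3(\varphi) := K_{11}^T(\varphi)\mathcal{L}_\omega^{-1}. \tag{6.29}
\]

To solve equation (6.28) we have to choose $\langle\hat\eta\rangle$ such that the right hand side in (6.28) has zero average. By Lemma 6.6 and (6.4), the $\varphi$-averaged matrix

\[
\langle M_1\rangle = \varepsilon^{2b}AD_S + O(\varepsilon^{5-2b}). \tag{6.30}
\]

Therefore, for $\varepsilon$ small, $\langle M_1\rangle$ is invertible and $\langle M_1\rangle^{-1} = O(\varepsilon^{-2b}) = O(\gamma^{-1})$ (recall (5.9)). Thus we define

\[
\langle\hat\eta\rangle := -\langle M_1\rangle^{-1}\big[\langle g_1\rangle + \langle M_2g_2\rangle + \langle M_3g_3\rangle - \langle M_2[\partial_\psi\theta_0]^T\rangle\langle g_2\rangle\big]. \tag{6.31}
\]

With this choice of $\langle\hat\eta\rangle$, equation (6.28) has the solution

\[
\hat\psi := \mathcal{D}_\omega^{-1}\big(g_1 + M_1(\varphi)\langle\hat\eta\rangle + M_2(\varphi)g_2 + M_3(\varphi)g_3 - M_2(\varphi)[\partial_\psi\theta_0]^T\langle g_2\rangle\big). \tag{6.32}
\]

In conclusion, we have constructed a solution $(\hat\psi,\hat\eta,\hat w,\hat\zeta)$ of the linear system (6.22).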
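The inversions of $\mathcal{D}_\omega = \omega\cdot\partial_\varphi$ used in (6.24) and (6.32) act diagonally on Fourier coefficients, and are only possible after the average has been removed. The following is a minimal numerical sketch of that mechanism, not part of the original argument: it assumes $\nu = 2$, a scalar model in which $\partial_\psi\theta_0 = I$, and a hypothetical frequency vector and datum `g2`; the helper name `invert_D_omega` is ours.

```python
import numpy as np

def invert_D_omega(g, omega):
    """Zero-average solution u of (omega . d_phi) u = g on the torus T^nu.

    g is sampled on a uniform grid with the same number of points per angle.
    Assumes <g> = 0 and omega . l != 0 for every nonzero resolved index l.
    """
    nu, N = g.ndim, g.shape[0]
    g_hat = np.fft.fftn(g)
    modes = np.fft.fftfreq(N, d=1.0 / N)            # integer Fourier indices
    L = np.meshgrid(*([modes] * nu), indexing="ij")
    divisor = sum(w * l for w, l in zip(omega, L))  # omega . l on the index grid
    u_hat = np.zeros_like(g_hat)
    ok = np.abs(divisor) > 1e-12                    # leaves out l = 0
    u_hat[ok] = g_hat[ok] / (1j * divisor[ok])      # divide by i omega . l
    return np.fft.ifftn(u_hat).real

# Hypothetical data: two angles, golden-mean frequency vector.
N = 32
phi = 2 * np.pi * np.arange(N) / N
P1, P2 = np.meshgrid(phi, phi, indexing="ij")
omega = np.array([1.0, (1.0 + np.sqrt(5.0)) / 2.0])
g2 = 0.7 + np.cos(P1 + 2 * P2)

zeta_hat = g2.mean()                                # analogue of (6.23)
eta_hat = invert_D_omega(g2 - zeta_hat, omega)      # analogue of (6.24), <eta> = 0
exact = np.sin(P1 + 2 * P2) / (omega[0] + 2 * omega[1])
```

In this model the choice $\hat\zeta = \langle g_2\rangle$ is exactly what makes the right hand side orthogonal to the kernel of $\mathcal{D}_\omega$ (the constants); the free average $\langle\hat\eta\rangle$ is then available to fix the compatibility condition of (6.28).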

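Proposition 6.8. Assume (6.4) and (6.26). Then, $\forall\omega \in \Omega_\infty$, $\forall g := (g_1,g_2,g_3)$, the system (6.22) has a solution $\mathbb{D}^{-1}g := (\hat\psi,\hat\eta,\hat w,\hat\zeta)$ where $(\hat\psi,\hat\eta,\hat w,\hat\zeta)$ are defined in (6.32), (6.24), (6.31), (6.27), (6.23), and satisfy

\[
\|\mathbb{D}^{-1}g\|_s^{\mathrm{Lip}(\gamma)} \leq_s \gamma^{-1}\big(\|g\|_{s+\mu}^{\mathrm{Lip}(\gamma)} + \varepsilon\gamma^{-1}\|\mathfrak{I}_0\|_{s+\mu}^{\mathrm{Lip}(\gamma)}\|g\|_{s_0+\mu}^{\mathrm{Lip}(\gamma)}\big). \tag{6.33}
\]

Proof. Recalling (6.29), by Lemma 6.6, (6.26), (6.4) we get $\|M_2h\|_{s_0} + \|M_3h\|_{s_0} \leq C\|h\|_{s_0+\sigma}$. Then, by (6.31) and $\langle M_1\rangle^{-1} = O(\varepsilon^{-2b}) = O(\gamma^{-1})$, we deduce $|\langle\hat\eta\rangle|^{\mathrm{Lip}(\gamma)} \leq C\gamma^{-1}\|g\|_{s_0+\sigma}^{\mathrm{Lip}(\gamma)}$, and (6.24), (5.16) imply $\|\hat\eta\|_s^{\mathrm{Lip}(\gamma)} \leq_s \gamma^{-1}\big(\|g\|_{s+\sigma}^{\mathrm{Lip}(\gamma)} + \|\mathfrak{I}_0\|_{s+\sigma}\|g\|_{s_0}^{\mathrm{Lip}(\gamma)}\big)$. The bound (6.33) is sharp for $\hat w$ because $\mathcal{L}_\omega^{-1}g_3$ in (6.27) is estimated using (6.26). Finally $\hat\psi$ satisfies (6.33) using (6.32), (6.29), (6.26), (5.16) and Lemma 6.6. □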

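Let $\tilde G_\delta(\psi,\eta,w,\zeta) := (G_\delta(\psi,\eta,w),\zeta)$. Let $\|(\psi,\eta,w,\zeta)\|_s^{\mathrm{Lip}(\gamma)}$ denote the maximum between $\|(\psi,\eta,w)\|_s^{\mathrm{Lip}(\gamma)}$ and $|\zeta|^{\mathrm{Lip}(\gamma)}$. We prove that the operator

\[
T_0 := (D\tilde G_\delta)(\varphi,0,0)\circ\mathbb{D}^{-1}\circ(DG_\delta)(\varphi,0,0)^{-1} \tag{6.34}
\]

is an approximate right inverse for $d_{i,\zeta}\mathcal{F}(i_0)$.

Theorem 6.9. (Approximate inverse) Assume (6.4) and the inversion assumption (6.26). Then there exists $\mu := \mu(\tau,\nu) > 0$ such that, for all $\omega \in \Omega_\infty$, for all $g := (g_1,g_2,g_3)$, the operator $T_0$ defined in (6.34) satisfies

\[
\|T_0g\|_s^{\mathrm{Lip}(\gamma)} \leq_s \gamma^{-1}\big(\|g\|_{s+\mu}^{\mathrm{Lip}(\gamma)} + \varepsilon\gamma^{-1}\|\mathfrak{I}_0\|_{s+\mu}^{\mathrm{Lip}(\gamma)}\|g\|_{s_0+\mu}^{\mathrm{Lip}(\gamma)}\big). \tag{6.35}
\]

The operator $T_0$ is an approximate inverse of $d_{i,\zeta}\mathcal{F}(i_0)$, namely

\[
\begin{aligned}
\|(d_{i,\zeta}\mathcal{F}(i_0)\circ T_0 - I)g\|_s^{\mathrm{Lip}(\gamma)} &\leq_s \gamma^{-1}\|\mathcal{F}(i_0,\zeta_0)\|_{s_0+\mu}^{\mathrm{Lip}(\gamma)}\|g\|_{s+\mu}^{\mathrm{Lip}(\gamma)} \\
&\quad + \gamma^{-1}\big\{\|\mathcal{F}(i_0,\zeta_0)\|_{s+\mu}^{\mathrm{Lip}(\gamma)} + \varepsilon\gamma^{-1}\|\mathcal{F}(i_0,\zeta_0)\|_{s_0+\mu}^{\mathrm{Lip}(\gamma)}\|\mathfrak{I}_0\|_{s+\mu}^{\mathrm{Lip}(\gamma)}\big\}\|g\|_{s_0+\mu}^{\mathrm{Lip}(\gamma)}.
\end{aligned} \tag{6.36}
\]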


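Proof. In this proof we denote $\|\ \|_s$ instead of $\|\ \|_s^{\mathrm{Lip}(\gamma)}$. The estimate (6.35) follows from (6.34), (6.33), (6.21). By (5.6), since $X_{\mathcal{N}}$ does not depend on $y$, and $i_\delta$ differs from $i_0$ only for the $y$ component, we have

\[
\begin{aligned}
d_{i,\zeta}\mathcal{F}(i_0)[\hat\imath,\hat\zeta] - d_{i,\zeta}\mathcal{F}(i_\delta)[\hat\imath,\hat\zeta] &= d_iX_P(i_\delta)[\hat\imath] - d_iX_P(i_0)[\hat\imath] \\
&= \int_0^1\partial_yd_iX_P(\theta_0, y_0 + s(y_\delta - y_0), z_0)[y_\delta - y_0,\hat\imath]\,ds =: \mathcal{E}_0[\hat\imath,\hat\zeta].
\end{aligned} \tag{6.37}
\]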

By HEED, EHD, El, (El , we estimate 

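\[
\|\mathcal{E}_0[\hat\imath,\hat\zeta]\|_s \leq_s \|Z\|_{s_0+\sigma}\|\hat\imath\|_{s+\sigma} + \big(\|Z\|_{s+\sigma} + \|Z\|_{s_0+\sigma}\|\mathfrak{I}_0\|_{s+\sigma}\big)\|\hat\imath\|_{s_0+\sigma} \tag{6.38}
\]

where $Z := \mathcal{F}(i_0,\zeta_0)$ (recall (6.5)). Note that $\mathcal{E}_0[\hat\imath,\hat\zeta]$ is, in fact, independent of $\hat\zeta$. Denote the set of variables $(\psi,\eta,w) =: \mathtt{u}$. Under the transformation $G_\delta$, the nonlinear operator $\mathcal{F}$ in (5.6) transforms into

\[
\mathcal{F}(G_\delta(\mathtt{u}(\varphi)),\zeta) = DG_\delta(\mathtt{u}(\varphi))\big(\mathcal{D}_\omega\mathtt{u}(\varphi) - X_K(\mathtt{u}(\varphi),\zeta)\big), \tag{6.39}
\]

where $K = H_{\varepsilon,\zeta}\circ G_\delta$, see (6.14)-(6.15). Differentiating (6.39) at the trivial torus $\mathtt{u}_\delta(\varphi) = G_\delta^{-1}(i_\delta)(\varphi) = (\varphi,0,0)$, at $\zeta = \zeta_0$, in the direction $(\hat{\mathtt{u}},\hat\zeta) = (DG_\delta(\mathtt{u}_\delta)^{-1}[\hat\imath],\hat\zeta) = D\tilde G_\delta(\mathtt{u}_\delta)^{-1}[\hat\imath,\hat\zeta]$, we get

\[
d_{i,\zeta}\mathcal{F}(i_\delta)[\hat\imath,\hat\zeta] = DG_\delta(\mathtt{u}_\delta)\big(\mathcal{D}_\omega\hat{\mathtt{u}} - d_{\mathtt{u},\zeta}X_K(\mathtt{u}_\delta,\zeta_0)[\hat{\mathtt{u}},\hat\zeta]\big) + \mathcal{E}_1[\hat\imath,\hat\zeta], \tag{6.40}
\]

\[
\mathcal{E}_1[\hat\imath,\hat\zeta] := D^2G_\delta(\mathtt{u}_\delta)\big[DG_\delta(\mathtt{u}_\delta)^{-1}\mathcal{F}(i_\delta,\zeta_0),\ DG_\delta(\mathtt{u}_\delta)^{-1}[\hat\imath]\big], \tag{6.41}
\]

where $d_{\mathtt{u},\zeta}X_K(\mathtt{u}_\delta,\zeta_0)$ is expanded in (6.20). In fact, $\mathcal{E}_1$ is independent of $\hat\zeta$. We split

\[
\mathcal{D}_\omega\hat{\mathtt{u}} - d_{\mathtt{u},\zeta}X_K(\mathtt{u}_\delta,\zeta_0)[\hat{\mathtt{u}},\hat\zeta] = \mathbb{D}[\hat{\mathtt{u}},\hat\zeta] + R_Z[\hat{\mathtt{u}},\hat\zeta],
\]

where $\mathbb{D}[\hat{\mathtt{u}},\hat\zeta]$ is defined in (6.22) and $R_Z[\hat\psi,\hat\eta,\hat w,\hat\zeta]$ is defined by difference, so that its first component is $-\partial_\psi K_{10}(\varphi)[\hat\psi]$, its second component is

\[
\partial_\psi[\partial_\psi\theta_0(\varphi)]^T[\hat\psi,\zeta_0] + \partial_{\psi\psi}K_{00}(\varphi)[\hat\psi] + [\partial_\psi K_{10}(\varphi)]^T\hat\eta + [\partial_\psi K_{01}(\varphi)]^T\hat w,
\]

and its third component is $-\partial_x\{\partial_\psi K_{01}(\varphi)[\hat\psi]\}$ (in fact, $R_Z$ is independent of $\hat\zeta$). By (6.37) and (6.40),

\[
d_{i,\zeta}\mathcal{F}(i_0) = DG_\delta(\mathtt{u}_\delta)\circ\mathbb{D}\circ D\tilde G_\delta(\mathtt{u}_\delta)^{-1} + \mathcal{E}_0 + \mathcal{E}_1 + \mathcal{E}_2, \qquad \mathcal{E}_2 := DG_\delta(\mathtt{u}_\delta)\circ R_Z\circ D\tilde G_\delta(\mathtt{u}_\delta)^{-1}. \tag{6.42}
\]

By Lemmata 6.4, 6.7, 6.1 and (6.10), (6.4), the terms $\mathcal{E}_1, \mathcal{E}_2$ satisfy the same bound (6.38) as $\mathcal{E}_0$. Thus the sum $\mathcal{E} := \mathcal{E}_0 + \mathcal{E}_1 + \mathcal{E}_2$ satisfies (6.38). Applying $T_0$ defined in (6.34) to the right in (6.42), since $\mathbb{D}\circ\mathbb{D}^{-1} = I$ (see Proposition 6.8), we get $d_{i,\zeta}\mathcal{F}(i_0)\circ T_0 - I = \mathcal{E}\circ T_0$. Then (6.36) follows from (6.35) and the bound (6.38) for $\mathcal{E}$. □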

7 The linearized operator in the normal directions 

The goal of this section is to write an explicit expression of the linearized operator $\mathcal{L}_\omega$ defined in (6.25), see Proposition 7.4. To this aim, we compute $\frac12(K_{02}(\psi)w,w)_{L^2(\mathbb{T})}$, $w \in H_S^\perp$, which collects all the terms of $(H_\varepsilon\circ G_\delta)(\psi,0,w)$ that are quadratic in $w$, see (6.14). We first recall some preliminary lemmata.

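Lemma 7.1 (Lemma 7.1 in [5]). Let $H$ be a Hamiltonian function of class $C^2(H_0^1(\mathbb{T}_x),\mathbb{R})$ and consider a map $\Phi(u) := u + \Psi(u)$ satisfying $\Psi(u) = \Pi_E\Psi(\Pi_Eu)$, for all $u$, where $E$ is a finite dimensional subspace as in (3.5). Then

\[
\partial_u[\nabla(H\circ\Phi)](u)[h] = (\partial_u\nabla H)(\Phi(u))[h] + \mathcal{R}(u)[h], \tag{7.1}
\]

where $\mathcal{R}(u)$ has the “finite dimensional” form

\[
\mathcal{R}(u)[h] = \sum_{|j|\leq C}\big(h, g_j(u)\big)_{L^2(\mathbb{T})}\chi_j(u) \tag{7.2}
\]

with $\chi_j(u) = e^{ijx}$ or $g_j(u) = e^{ijx}$. The remainder in (7.2) is $\mathcal{R}(u) = \mathcal{R}_0(u) + \mathcal{R}_1(u) + \mathcal{R}_2(u)$ with

\[
\begin{aligned}
\mathcal{R}_0(u) &:= (\partial_u\nabla H)(\Phi(u))\partial_u\Psi(u), \qquad \mathcal{R}_1(u) := [\partial_u\{\Psi'(u)^T\}][\cdot,\nabla H(\Phi(u))], \\
\mathcal{R}_2(u) &:= [\partial_u\Psi(u)]^T(\partial_u\nabla H)(\Phi(u))\partial_u\Phi(u).
\end{aligned} \tag{7.3}
\]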

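Lemma 7.2 (Lemma 7.3 in [5]). Let $\mathcal{R}$ be an operator of the form

\[
\mathcal{R}h = \sum_{|j|\leq C}\int_0^1\big(h, g_j(\tau)\big)_{L^2(\mathbb{T})}\chi_j(\tau)\,d\tau, \tag{7.4}
\]

where the functions $g_j(\tau), \chi_j(\tau) \in H^s$, $\tau \in [0,1]$ depend in a Lipschitz way on the parameter $\omega$. Then its matrix $s$-decay norm (see (2.4)) satisfies

\[
|\mathcal{R}|_s^{\mathrm{Lip}(\gamma)} \leq_s \sum_{|j|\leq C}\sup_{\tau\in[0,1]}\big(\|\chi_j(\tau)\|_s^{\mathrm{Lip}(\gamma)}\|g_j(\tau)\|_{s_0}^{\mathrm{Lip}(\gamma)} + \|\chi_j(\tau)\|_{s_0}^{\mathrm{Lip}(\gamma)}\|g_j(\tau)\|_s^{\mathrm{Lip}(\gamma)}\big).
\]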

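7.1 Composition with the map $G_\delta$

In the sequel we use the fact that $\mathfrak{I}_\delta := \mathfrak{I}_\delta(\varphi;\omega) := i_\delta(\varphi;\omega) - (\varphi,0,0)$ satisfies, by (6.8) and (6.4),

\[
\|\mathfrak{I}_\delta\|_{s_0+\mu}^{\mathrm{Lip}(\gamma)} \leq C\varepsilon^{5-2b}\gamma^{-1} = C\varepsilon^{5-4b}. \tag{7.5}
\]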

In this section we study the Hamiltonian $K := H_\varepsilon\circ G_\delta = \varepsilon^{-2b}\mathcal{H}\circ A_\varepsilon\circ G_\delta$ defined in (6.14), (4.6). Recalling (4.7), (6.13), $A_\varepsilon\circ G_\delta$ has the form

\[
A_\varepsilon(G_\delta(\psi,\eta,w)) = \varepsilon v_\varepsilon\big(\theta_0(\psi),\ y_\delta(\psi) + L_1(\psi)\eta + L_2(\psi)w\big) + \varepsilon^b(z_0(\psi) + w) \tag{7.6}
\]

where $v_\varepsilon$ is defined in (4.7), and

\[
L_1(\psi) := [\partial_\psi\theta_0(\psi)]^{-T}, \qquad L_2(\psi) := [(\partial_\theta\tilde z_0)(\theta_0(\psi))]^T\partial_x^{-1}. \tag{7.7}
\]

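By Taylor’s formula, we develop (7.6) in $w$ at $(\eta,w) = (0,0)$, and we get

\[
(A_\varepsilon\circ G_\delta)(\psi,0,w) = T_\delta(\psi) + T_1(\psi)w + T_2(\psi)[w,w] + T_{\geq3}(\psi,w),
\]

where

\[
T_\delta(\psi) := A_\varepsilon(G_\delta(\psi,0,0)) = \varepsilon v_\delta(\psi) + \varepsilon^bz_0(\psi), \qquad v_\delta(\psi) := v_\varepsilon(\theta_0(\psi),y_\delta(\psi)) \tag{7.8}
\]

is the approximate isotropic torus in the phase space $H_0^1(\mathbb{T})$ (it corresponds to $i_\delta$ in Lemma 6.3),

\[
T_1(\psi)w := \varepsilon^{2b-1}U_1(\psi)w + \varepsilon^bw, \qquad T_2(\psi)[w,w] := \varepsilon^{4b-3}U_2(\psi)[w,w],
\]

\[
U_1(\psi)w := \sum_{j\in S}\frac{|j|\,[L_2(\psi)w]_j\,e^{i[\theta_0(\psi)]_j}}{2\sqrt{\xi_j + \varepsilon^{2(b-1)}|j|[y_\delta(\psi)]_j}}\,e^{ijx}, \tag{7.9}
\]

\[
U_2(\psi)[w,w] := -\sum_{j\in S}\frac{j^2\,[L_2(\psi)w]_j^2\,e^{i[\theta_0(\psi)]_j}}{8\{\xi_j + \varepsilon^{2(b-1)}|j|[y_\delta(\psi)]_j\}^{3/2}}\,e^{ijx}, \tag{7.10}
\]

and $T_{\geq3}(\psi,w)$ collects all the terms of order at least cubic in $w$. The terms $U_1, U_2 = O(1)$ in $\varepsilon$. Moreover, using that $L_2(\psi)$ in (7.7) vanishes as $z_0 = 0$, they satisfy

\[
\begin{aligned}
\|U_1w\|_s &\leq_s \|\mathfrak{I}_\delta\|_s\|w\|_{s_0} + \|\mathfrak{I}_\delta\|_{s_0}\|w\|_s, \\
\|U_2[w,w]\|_s &\leq_s \|\mathfrak{I}_\delta\|_s\|\mathfrak{I}_\delta\|_{s_0}\|w\|_{s_0}^2 + \|\mathfrak{I}_\delta\|_{s_0}^2\|w\|_{s_0}\|w\|_s
\end{aligned} \tag{7.11}
\]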

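and also in the $\|\ \|_s^{\mathrm{Lip}(\gamma)}$-norm. We expand $\mathcal{H}$ by Taylor’s formula

\[
\mathcal{H}(u + h) = \mathcal{H}(u) + \big((\nabla\mathcal{H})(u),h\big)_{L^2(\mathbb{T})} + \tfrac12\big((\partial_u\nabla\mathcal{H})(u)[h],h\big)_{L^2(\mathbb{T})} + O(h^3).
\]

Specifying at $u = T_\delta(\psi)$ and $h = T_1(\psi)w + T_2(\psi)[w,w] + T_{\geq3}(\psi,w)$, we obtain that the sum of all the components of $K = \varepsilon^{-2b}(\mathcal{H}\circ A_\varepsilon\circ G_\delta)(\psi,0,w)$ that are quadratic in $w$ is

\[
\tfrac12(K_{02}w,w)_{L^2(\mathbb{T})} = \varepsilon^{-2b}\big((\nabla\mathcal{H})(T_\delta),T_2[w,w]\big)_{L^2(\mathbb{T})} + \varepsilon^{-2b}\tfrac12\big((\partial_u\nabla\mathcal{H})(T_\delta)[T_1w],T_1w\big)_{L^2(\mathbb{T})}.
\]

Inserting the expressions (7.9), (7.10) in the last equality we get

\[
\begin{aligned}
K_{02}(\psi)w &= (\partial_u\nabla\mathcal{H})(T_\delta)[w] + 2\varepsilon^{b-1}(\partial_u\nabla\mathcal{H})(T_\delta)[U_1w] \\
&\quad + \varepsilon^{2(b-1)}U_1^T(\partial_u\nabla\mathcal{H})(T_\delta)[U_1w] + 2\varepsilon^{2b-3}U_2[w,\cdot]^T(\nabla\mathcal{H})(T_\delta).
\end{aligned} \tag{7.12}
\]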


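Lemma 7.3. The operator $K_{02}$ reads

\[
(K_{02}(\psi)w,w)_{L^2(\mathbb{T})} = \big((\partial_u\nabla\mathcal{H})(T_\delta)[w],w\big)_{L^2(\mathbb{T})} + (R(\psi)w,w)_{L^2(\mathbb{T})} \tag{7.13}
\]

where $R(\psi)w$ has the “finite dimensional” form

\[
R(\psi)w = \sum_{|j|\leq C}\big(w,g_j(\psi)\big)_{L^2(\mathbb{T})}\chi_j(\psi). \tag{7.14}
\]

The functions $g_j, \chi_j$ satisfy, for some $\sigma := \sigma(\nu,\tau) > 0$,

\[
\|g_j\|_s^{\mathrm{Lip}(\gamma)}\|\chi_j\|_{s_0}^{\mathrm{Lip}(\gamma)} + \|g_j\|_{s_0}^{\mathrm{Lip}(\gamma)}\|\chi_j\|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^{b+1}\|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}, \tag{7.15}
\]

\[
\begin{aligned}
\|\partial_ig_j[\hat\imath]\|_s\|\chi_j\|_{s_0} + \|\partial_ig_j[\hat\imath]\|_{s_0}\|\chi_j\|_s + \|g_j\|_{s_0}\|\partial_i\chi_j[\hat\imath]\|_s &+ \|g_j\|_s\|\partial_i\chi_j[\hat\imath]\|_{s_0} \\
&\leq_s \varepsilon^{b+1}\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big),
\end{aligned} \tag{7.16}
\]

where $i = (\theta,y,z)$ (see (5.1)) and $\hat\imath = (\hat\theta,\hat y,\hat z)$.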

Proof. Since $U_1 = \Pi_SU_1$ and $U_2 = \Pi_SU_2$, the last three terms in (7.12) all have the form (7.14). We have to prove that they are also small in size.

By (4.8), (6.13), (7.7), the only term in $\varepsilon^{-2b}H_2(A_\varepsilon(G_\delta(\psi,\eta,w)))$ that is quadratic in $w$ is $\frac12\int_{\mathbb{T}}w_x^2\,dx$, so this is the only contribution to (7.12) coming from $H_2$.

It remains to consider all the terms coming from $\mathcal{H}_{\geq4} := \mathcal{H}_4 + \mathcal{H}_{\geq5} = O(u^4)$. The term $\varepsilon^{b-1}\partial_u\nabla\mathcal{H}_{\geq4}(T_\delta)U_1$, the term $\varepsilon^{2(b-1)}U_1^T(\partial_u\nabla\mathcal{H}_{\geq4})(T_\delta)U_1$ and the term $\varepsilon^{2b-3}U_2^T\nabla\mathcal{H}_{\geq4}(T_\delta)$ all have the form (7.14) and, using the inequality $\|T_\delta\|_s^{\mathrm{Lip}(\gamma)} \leq \varepsilon(1 + \|\mathfrak{I}_\delta\|_s^{\mathrm{Lip}(\gamma)})$, (7.11) and (6.4), the bound (7.15) holds. By (6.11) and using the explicit formulae (7.7)-(7.10) we get (7.16). □

The conclusion of this section is that, after the composition with the action-angle variables, the rescaling (4.5), and the transformation $G_\delta$, the linearized operator to analyze is $w \mapsto (\partial_u\nabla\mathcal{H})(T_\delta)[w]$, $w \in H_S^\perp$, up to finite dimensional operators which have the form (7.14) and size (7.15).

7.2 The linearized operator in the normal directions 

In view of (7.13) we now compute $((\partial_u\nabla\mathcal{H})(T_\delta)[w],w)_{L^2(\mathbb{T})}$, $w \in H_S^\perp$, where $\mathcal{H} = H\circ\Phi_B$ and $\Phi_B$ is the Birkhoff map of Proposition 3.1. We recall that $\Phi_B(u) = u + \Psi(u)$ where $\Psi$ satisfies (3.6) and $\Psi(u) = O(u^3)$. It is convenient to estimate separately the terms in

\[
\mathcal{H} = H\circ\Phi_B = H_2\circ\Phi_B + H_4\circ\Phi_B + H_{\geq5}\circ\Phi_B \tag{7.17}
\]

where $H_2, H_4, H_{\geq5}$ are defined in (3.4).

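We first consider $H_{\geq5}\circ\Phi_B$. By (3.4) we get $\nabla H_{\geq5}(u) = \pi_0[(\partial_uf)(x,u,u_x)] - \partial_x\{(\partial_{u_x}f)(x,u,u_x)\}$ where $\pi_0$ is the operator defined in (1.32). Since $\Phi_B$ has the form (3.6), Lemma 7.1 (at $u = T_\delta$, see (7.8)) implies that

\[
\begin{aligned}
\partial_u\nabla(H_{\geq5}\circ\Phi_B)(T_\delta)[h] &= (\partial_u\nabla H_{\geq5})(\Phi_B(T_\delta))[h] + \mathcal{R}_{H_{\geq5}}(T_\delta)[h] \\
&= \partial_x\big(r_1(T_\delta)\partial_xh\big) + r_0(T_\delta)h + \mathcal{R}_{H_{\geq5}}(T_\delta)[h]
\end{aligned} \tag{7.18}
\]

where the multiplicative functions $r_0(T_\delta)$, $r_1(T_\delta)$ are

\[
r_0(T_\delta) := \sigma_0(\Phi_B(T_\delta)), \qquad r_1(T_\delta) := \sigma_1(\Phi_B(T_\delta)), \tag{7.19}
\]

\[
\sigma_0(u) := (\partial_{uu}f)(x,u,u_x) - \partial_x\{(\partial_{uu_x}f)(x,u,u_x)\}, \qquad \sigma_1(u) := -(\partial_{u_xu_x}f)(x,u,u_x),
\]

the remainder $\mathcal{R}_{H_{\geq5}}(u)$ has the form (7.2) with $\chi_j = e^{ijx}$ or $g_j = e^{ijx}$ and, using (7.3), it satisfies, for some $\sigma := \sigma(\nu,\tau) > 0$,

\[
\|g_j\|_s^{\mathrm{Lip}(\gamma)}\|\chi_j\|_{s_0}^{\mathrm{Lip}(\gamma)} + \|g_j\|_{s_0}^{\mathrm{Lip}(\gamma)}\|\chi_j\|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^5\big(1 + \|\mathfrak{I}_\delta\|_{s+2}^{\mathrm{Lip}(\gamma)}\big), \tag{7.20}
\]

\[
\begin{aligned}
\|\partial_ig_j[\hat\imath]\|_s\|\chi_j\|_{s_0} + \|\partial_ig_j[\hat\imath]\|_{s_0}\|\chi_j\|_s + \|g_j\|_{s_0}\|\partial_i\chi_j[\hat\imath]\|_s &+ \|g_j\|_s\|\partial_i\chi_j[\hat\imath]\|_{s_0} \\
&\leq_s \varepsilon^5\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+2}\|\hat\imath\|_{s_0+2}\big).
\end{aligned}
\]

Now we consider the contributions from $H_2\circ\Phi_B$ and $H_4\circ\Phi_B$. By Lemma 7.1 and the expressions of $H_2, H_4$ in (3.4) we deduce that

\[
\partial_u\nabla(H_2\circ\Phi_B)(T_\delta)[h] = -\partial_{xx}h + \mathcal{R}_{H_2}(T_\delta)[h], \tag{7.21}
\]

\[
\partial_u\nabla(H_4\circ\Phi_B)(T_\delta)[h] = -3\varsigma(\Phi_B(T_\delta))^2h + \mathcal{R}_{H_4}(T_\delta)[h], \tag{7.22}
\]

where $\mathcal{R}_{H_2}(u)$, $\mathcal{R}_{H_4}(u)$ have the form (7.2). By (7.3), they have size $\mathcal{R}_{H_2}(T_\delta) = O(\varepsilon^2)$, $\mathcal{R}_{H_4}(T_\delta) = O(\varepsilon^4)$. More precisely, the functions $g_j, \chi_j$ in $\mathcal{R}_{H_4}(T_\delta)$ satisfy the bounds in (7.20) with $\varepsilon^5$ replaced by $\varepsilon^4$. Regarding $\mathcal{R}_{H_2}(T_\delta)$, we need to find an exact formula for the terms of order $\varepsilon^2$.

The sum of (7.18), (7.21) and (7.22) gives a formula for $\partial_u\nabla\mathcal{H}(T_\delta)[h]$, where the terms of form (7.2) and order $\varepsilon^2$ are confined in $\mathcal{R}_{H_2}(T_\delta)$. On the other hand, recall the decomposition $\mathcal{H} = H_2 + \mathcal{H}_4 + \mathcal{H}_{\geq5}$, with $\partial_u\nabla H_2(T_\delta) = -\partial_{xx}$, while $\partial_u\nabla\mathcal{H}_{\geq5}(T_\delta) = O(\varepsilon^3)$. Therefore all the terms of order $\varepsilon^2$ in $\partial_u\nabla\mathcal{H}(T_\delta)$ can only come from $\partial_u\nabla\mathcal{H}_4(T_\delta)$. Using formula (3.8) for $\mathcal{H}_4$, we calculate

\[
\Pi_S^\perp\big(\partial_u\nabla\mathcal{H}_4(T_\delta)[h]\big) = -3\varsigma\,\Pi_S^\perp(T_\delta^2h) \qquad \forall h \in H^s_{S^\perp}.
\]

Hence all the terms of order $\varepsilon^2$ in $\Pi_S^\perp(\partial_u\nabla\mathcal{H}(T_\delta)[h])$ are contained in the term $-3\varsigma\,\Pi_S^\perp(T_\delta^2h)$ (and the term $-3\varsigma\,\Pi_S^\perp(T_\delta^2h)$ is included in $-3\varsigma\,\Pi_S^\perp[(\Phi_B(T_\delta))^2h]$ because $\Phi_B(T_\delta) = T_\delta + \Psi(T_\delta)$). As a consequence, $\Pi_S^\perp\mathcal{R}_{H_2}(T_\delta)$ is of size $O(\varepsilon^3)$, and its functions $g_j, \chi_j$ (see (7.2)) satisfy (7.20) with $\varepsilon^5$ replaced by $\varepsilon^3$.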

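By Lemma 7.3 and the results of this section we deduce:

Proposition 7.4. Assume (7.5). Then the Hamiltonian operator $\mathcal{L}_\omega$ has the form, $\forall h \in H^s_{S^\perp}(\mathbb{T}^{\nu+1})$,

\[
\mathcal{L}_\omega h := \mathcal{D}_\omega h - \partial_xK_{02}h = \Pi_S^\perp\big(\mathcal{D}_\omega h + \partial_{xx}(a_1\partial_xh) + \partial_x(a_0h) - \partial_x\mathcal{R}_*h\big) \tag{7.23}
\]

where $\mathcal{R}_* := \mathcal{R}_{H_2}(T_\delta) + \mathcal{R}_{H_4}(T_\delta) + \mathcal{R}_{H_{\geq5}}(T_\delta) + R(\psi)$ (with $R(\psi)$ defined in Lemma 7.3 and $\mathcal{R}_{H_2}(T_\delta)$, $\mathcal{R}_{H_4}(T_\delta)$, $\mathcal{R}_{H_{\geq5}}(T_\delta)$ defined in (7.18), (7.21), (7.22)), the functions

\[
a_1 := 1 - r_1(T_\delta), \qquad a_0 := 3\varsigma(\Phi_B(T_\delta))^2 - r_0(T_\delta), \tag{7.24}
\]

$r_0, r_1$ are defined in (7.19), and $T_\delta$ in (7.8). They satisfy

\[
\|a_1 - 1\|_s^{\mathrm{Lip}(\gamma)} + \|a_0 - 3\varsigma T_\delta^2\|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^3\big(1 + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}\big), \tag{7.25}
\]

\[
\|\partial_ia_1[\hat\imath]\|_s + \|\partial_i(a_0 - 3\varsigma T_\delta^2)[\hat\imath]\|_s \leq_s \varepsilon^3\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big) \tag{7.26}
\]

where $\mathfrak{I}_\delta(\varphi) := (\theta_0(\varphi) - \varphi,\ y_\delta(\varphi),\ z_0(\varphi))$ corresponds to $T_\delta$. The remainder $\mathcal{R}_*$ has the form (7.2), and its coefficients $g_j, \chi_j$ satisfy the bounds (7.15)-(7.16).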

Remark 7.5. For $K = H + \lambda M^2$, $\lambda = 3\varsigma/4$, the coefficient $a_0$ in (7.24) becomes

\[
a_0 = 3\varsigma\,\pi_0[(\Phi_B(T_\delta))^2] - r_0(T_\delta),
\]

where $\pi_0$ is defined in (1.32). Thus the space average of $a_0$ has size $O(\varepsilon^3)$. □

The bounds (7.15) imply, by Lemma 7.2, estimates for the $s$-decay norms of $\mathcal{R}_*$. The linearized operator $\mathcal{L}_\omega := \mathcal{L}_\omega(\omega,i_\delta(\omega))$ depends on the parameter $\omega$ both directly and through the dependence on the torus $i_\delta(\omega)$. We have also estimated the partial derivative $\partial_i$ with respect to the variables $i$ (see (5.1)) in order to control, along the nonlinear Nash-Moser iteration, the Lipschitz variation of the eigenvalues of $\mathcal{L}_\omega$ with respect to $\omega$ and the approximate solution $i_\delta$.


8 Reduction of the linearized operator in the normal directions

The goal of this section is to conjugate the Hamiltonian linear operator $\mathcal{L}_\omega$ in (7.23) to the constant coefficients linear operator $\mathcal{L}_\infty$ defined in (8.64). The proof is obtained by applying different kinds of symplectic transformations. We shall always assume (7.5).

8.1 Space reduction at the order $\partial_{xxx}$

As a first step, we symplectically conjugate the operator $\mathcal{L}_\omega$ in (7.23) to $\mathcal{L}_1$ in (8.13), which has the coefficient of $\partial_{xxx}$ independent of the space variable. Because of the Hamiltonian structure, this step also eliminates the terms $O(\partial_{xx})$.

We look for a $\varphi$-dependent family of symplectic diffeomorphisms $\Phi(\varphi)$ of $H_S^\perp$ which differ from

\[
A_\perp := \Pi_S^\perp A\Pi_S^\perp, \qquad (Ah)(\varphi,x) := (1 + \beta_x(\varphi,x))h(\varphi,x + \beta(\varphi,x)), \tag{8.1}
\]

up to a small “finite dimensional” remainder, see (8.3). For each $\varphi \in \mathbb{T}^\nu$, the map $A(\varphi)$ is a symplectic map of the phase space, see Remark 3.3 in [3]. If $\|\beta\|_{W^{1,\infty}} < 1/2$, then $A$ is invertible (see Lemma 2.3), and its inverse and adjoint maps are

\[
(A^{-1}h)(\varphi,y) = (1 + \tilde\beta_y(\varphi,y))h(\varphi,y + \tilde\beta(\varphi,y)), \qquad (A^Th)(\varphi,y) = h(\varphi,y + \tilde\beta(\varphi,y))
\]

where $x = y + \tilde\beta(\varphi,y)$ is the inverse diffeomorphism (of $\mathbb{T}$) of $y = x + \beta(\varphi,x)$.

The restricted map $A_\perp(\varphi) : H_S^\perp \to H_S^\perp$ is not symplectic. We have already observed in the introduction that $A(\varphi)$ is the time-1 flow map of the linear Hamiltonian PDE (1.30). The equation (1.30) is a linear transport equation, whose characteristic curves are the solutions of the ODE

\[
\frac{d}{d\tau}x = -b(\varphi,\tau,x).
\]

To obtain a symplectic transformation close to $A_\perp$, we define a symplectic map $\Phi$ of $H_S^\perp$ as the time 1 flow of the Hamiltonian PDE (1.31). The linear operator $\Pi_S^\perp\partial_x(b(\tau,x)u)$ is the Hamiltonian vector field generated by the quadratic Hamiltonian $\frac12\int_{\mathbb{T}}b(\tau,x)u^2\,dx$ restricted to $H_S^\perp$. The flow of (1.31) is well defined in the Sobolev spaces $H^s_{S^\perp}(\mathbb{T}_x)$ for $b(\varphi,\tau,x)$ smooth enough, by standard theory of linear hyperbolic PDEs (see e.g. section 0.8 in [28]). The difference between the time 1 flow map $\Phi$ and $A_\perp$ is a “finite-dimensional” remainder of size $O(\beta)$.

Lemma 8.1 (Lemma 8.1 of [5]). For $\|\beta\|_{W^{s_0+1,\infty}}$ small, there exists an invertible symplectic transformation $\Phi = A_\perp + \mathcal{R}_\Phi$ of $H^s_{S^\perp}$, where $A_\perp$ is defined in (8.1) and $\mathcal{R}_\Phi$ is a “finite-dimensional” remainder

\[
\mathcal{R}_\Phi h = \sum_{j\in S}\int_0^1\big(h,g_j(\tau)\big)_{L^2(\mathbb{T})}\chi_j(\tau)\,d\tau + \sum_{j\in S}\big(h,\psi_j\big)_{L^2(\mathbb{T})}e^{ijx} \tag{8.3}
\]

for some functions $\chi_j(\tau), g_j(\tau), \psi_j \in H^s$ satisfying for all $\tau \in [0,1]$

\[
\|\psi_j\|_s + \|g_j(\tau)\|_s \leq_s \|\beta\|_{W^{s+2,\infty}}, \qquad \|\chi_j(\tau)\|_s \leq_s 1 + \|\beta\|_{W^{s+1,\infty}}. \tag{8.4}
\]

Moreover

\[
\|\Phi h\|_s + \|\Phi^{-1}h\|_s \leq_s \|h\|_s + \|\beta\|_{W^{s+2,\infty}}\|h\|_{s_0} \qquad \forall h \in H^s_{S^\perp}. \tag{8.5}
\]


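We conjugate $\mathcal{L}_\omega$ in (7.23) via the symplectic map $\Phi = A_\perp + \mathcal{R}_\Phi$ of Lemma 8.1. Using the splitting $\Pi_S^\perp = I - \Pi_S$, we compute

\[
\mathcal{L}_\omega\Phi = \Phi\mathcal{D}_\omega + \Pi_S^\perp A\big(b_3\partial_{yyy} + b_2\partial_{yy} + b_1\partial_y + b_0\big)\Pi_S^\perp + \mathcal{R}_I \tag{8.6}
\]

where the coefficients $b_i(\varphi,y)$, $i = 0,1,2,3$, are

\[
\begin{aligned}
b_3 &:= A^T[a_1(1 + \beta_x)^3], \qquad b_2 := A^T\big[2(a_1)_x(1 + \beta_x)^2 + 6a_1\beta_{xx}(1 + \beta_x)\big], \\
b_1 &:= A^T\Big[(\mathcal{D}_\omega\beta) + \frac{3a_1\beta_{xx}^2}{1 + \beta_x} + 4a_1\beta_{xxx} + 6(a_1)_x\beta_{xx} + ((a_1)_{xx} + a_0)(1 + \beta_x)\Big], \\
b_0 &:= A^T\Big[\frac{(\mathcal{D}_\omega\beta_x) + a_1\beta_{xxxx} + 2(a_1)_x\beta_{xxx} + ((a_1)_{xx} + a_0)\beta_{xx}}{1 + \beta_x} + (a_0)_x\Big]
\end{aligned} \tag{8.7}
\]

and the remainder

\[
\begin{aligned}
\mathcal{R}_I &:= -\Pi_S^\perp\big(a_1\partial_{xxx} + 2(a_1)_x\partial_{xx} + ((a_1)_{xx} + a_0)\partial_x + (a_0)_x\big)\Pi_SA\Pi_S^\perp \\
&\quad - \Pi_S^\perp\partial_x\mathcal{R}_*A_\perp + [\mathcal{D}_\omega,\mathcal{R}_\Phi] + (\mathcal{L}_\omega - \mathcal{D}_\omega)\mathcal{R}_\Phi.
\end{aligned} \tag{8.8}
\]

The commutator $[\mathcal{D}_\omega,\mathcal{R}_\Phi]$ has the form (8.3) with $\mathcal{D}_\omega g_j$ or $\mathcal{D}_\omega\chi_j$, $\mathcal{D}_\omega\psi_j$ instead of $\chi_j$, $g_j$, $\psi_j$ respectively. Also the last term $(\mathcal{L}_\omega - \mathcal{D}_\omega)\mathcal{R}_\Phi$ in (8.8) has the form (8.3) (note that $\mathcal{L}_\omega - \mathcal{D}_\omega$ does not contain derivatives with respect to $\varphi$). By (8.6), and decomposing $I = \Pi_S + \Pi_S^\perp$, we get

\[
\mathcal{L}_\omega\Phi = \Phi\big(\mathcal{D}_\omega + b_3\partial_{yyy} + b_2\partial_{yy} + b_1\partial_y + b_0\big)\Pi_S^\perp + \mathcal{R}_{II}, \tag{8.9}
\]

\[
\mathcal{R}_{II} := \big\{\Pi_S^\perp(A - I)\Pi_S - \mathcal{R}_\Phi\big\}\big(b_3\partial_{yyy} + b_2\partial_{yy} + b_1\partial_y + b_0\big)\Pi_S^\perp + \mathcal{R}_I. \tag{8.10}
\]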

Now we choose the function $\beta = \beta(\varphi,x)$ such that

\[
a_1(\varphi,x)(1 + \beta_x(\varphi,x))^3 = b_3(\varphi) \tag{8.11}
\]

so that the coefficient $b_3$ in (8.7) depends only on $\varphi$ (note that $A^T[b_3(\varphi)] = b_3(\varphi)$). The only solution of (8.11) with zero space average is (see e.g. [3]-section 3.1) $\beta := \partial_x^{-1}\rho_0$, where $\rho_0 := b_3(\varphi)^{1/3}(a_1(\varphi,x))^{-1/3} - 1$, and

\[
b_3(\varphi) = \Big(\frac{1}{2\pi}\int_{\mathbb{T}}(a_1(\varphi,x))^{-1/3}\,dx\Big)^{-3}. \tag{8.12}
\]

Applying the symplectic map $\Phi^{-1}$ in (8.9) we obtain the Hamiltonian operator (see Definition 2.2)

\[
\mathcal{L}_1 := \Phi^{-1}\mathcal{L}_\omega\Phi = \Pi_S^\perp\big(\omega\cdot\partial_\varphi + b_3(\varphi)\partial_{yyy} + b_1\partial_y + b_0\big)\Pi_S^\perp + \mathfrak{R}_1 \tag{8.13}
\]

where $\mathfrak{R}_1 := \Phi^{-1}\mathcal{R}_{II}$. Note that the term $b_2\partial_{yy}$ has disappeared from (8.13) because, by the Hamiltonian nature of $\mathcal{L}_1$, the coefficient $b_2 = 2(b_3)_y$ (see [3]-Remark 3.5) and therefore, by (8.12), $b_2 = 2(b_3)_y = 0$.
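The choice (8.11)-(8.12) can be checked numerically at a fixed $\varphi$. The sketch below is only an illustration under stated assumptions: the coefficient `a1` is a hypothetical smooth function close to 1 (not one arising from the paper's construction), and $\partial_x^{-1}$ is implemented on Fourier modes with the zero mode set to zero.

```python
import numpy as np

N = 256
x = 2 * np.pi * np.arange(N) / N
a1 = 1.0 + 0.1 * np.cos(x) + 0.05 * np.sin(2 * x)   # hypothetical coefficient, close to 1

# (8.12): b3 = ( (1/2pi) * integral of a1^{-1/3} )^{-3}; the grid mean is the normalized integral
b3 = np.mean(a1 ** (-1.0 / 3.0)) ** (-3)

# rho0 = b3^{1/3} a1^{-1/3} - 1 has zero space average precisely because of the choice of b3
rho0 = b3 ** (1.0 / 3.0) * a1 ** (-1.0 / 3.0) - 1.0

# beta = d_x^{-1} rho0: divide every nonzero Fourier mode by (i k), keep zero average
k = np.fft.fftfreq(N, d=1.0 / N)
rho_hat = np.fft.fft(rho0)
beta_hat = np.zeros_like(rho_hat)
beta_hat[1:] = rho_hat[1:] / (1j * k[1:])
beta = np.fft.ifft(beta_hat).real
beta_x = np.fft.ifft(1j * k * beta_hat).real

# (8.11): a1 (1 + beta_x)^3 should be the constant b3
residual = np.max(np.abs(a1 * (1.0 + beta_x) ** 3 - b3))
```

The zero-average condition on $\rho_0$ is what makes $\beta$ periodic; any other value of $b_3$ would leave a nonzero mean in $\rho_0$ and $\partial_x^{-1}\rho_0$ would not be defined on $\mathbb{T}$.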

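Lemma 8.2 (Lemma 8.2 of [5]). The operator $\mathfrak{R}_1$ in (8.13) has the form (7.4).

Since $a_1 = 1 + O(\varepsilon^3)$ and $a_0 = 3\varsigma T_\delta^2 + O(\varepsilon^3)$ (see (7.25)-(7.26) for the precise estimates), by the usual composition estimates we deduce the following lemma.

Lemma 8.3. There is $\sigma = \sigma(\tau,\nu) > 0$ such that

\[
\|\beta\|_s^{\mathrm{Lip}(\gamma)} + \|b_3 - 1\|_s^{\mathrm{Lip}(\gamma)} + \|b_1 - 3\varsigma T_\delta^2\|_s^{\mathrm{Lip}(\gamma)} + \|b_0 - 3\varsigma(T_\delta^2)_x\|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^3\big(1 + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}\big), \tag{8.14}
\]

\[
\|\partial_i\beta[\hat\imath]\|_s + \|\partial_ib_3[\hat\imath]\|_s + \|\partial_i(b_1 - 3\varsigma T_\delta^2)[\hat\imath]\|_s + \|\partial_i(b_0 - 3\varsigma(T_\delta^2)_x)[\hat\imath]\|_s \leq_s \varepsilon^3\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big), \tag{8.15}
\]

where $T_\delta$ is defined in (7.8). The transformations $\Phi$, $\Phi^{-1}$ satisfy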

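\[
\|\Phi h\|_s^{\mathrm{Lip}(\gamma)} + \|\Phi^{-1}h\|_s^{\mathrm{Lip}(\gamma)} \leq_s \|h\|_{s+\sigma}^{\mathrm{Lip}(\gamma)} + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}\|h\|_{s_0+\sigma}^{\mathrm{Lip}(\gamma)}, \tag{8.16}
\]

\[
\begin{aligned}
\|\partial_i(\Phi h)[\hat\imath]\|_s + \|\partial_i(\Phi^{-1}h)[\hat\imath]\|_s &\leq_s \|h\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma} + \|h\|_{s_0+\sigma}\|\hat\imath\|_{s+\sigma} \\
&\quad + \|\mathfrak{I}_\delta\|_{s+\sigma}\|h\|_{s_0+\sigma}\|\hat\imath\|_{s_0+\sigma}.
\end{aligned} \tag{8.17}
\]

Moreover the remainder $\mathfrak{R}_1$ has the form (7.4), where the functions $\chi_j(\tau)$, $g_j(\tau)$ satisfy the estimates (7.15)-(7.16) uniformly in $\tau \in [0,1]$.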

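8.2 Time reduction at the order $\partial_{xxx}$

The goal of this section is to get a constant coefficient in front of $\partial_{yyy}$, using a quasi-periodic reparametrization of time. We consider the change of variable

\[
(Bw)(\varphi,y) := w(\varphi + \omega\alpha(\varphi),y), \qquad (B^{-1}h)(\vartheta,y) := h(\vartheta + \omega\tilde\alpha(\vartheta),y), \tag{8.18}
\]

where $\mathbb{T}^\nu \to \mathbb{T}^\nu$, $\vartheta \mapsto \varphi = \vartheta + \omega\tilde\alpha(\vartheta)$ is the inverse diffeomorphism of $\vartheta = \varphi + \omega\alpha(\varphi)$ in $\mathbb{T}^\nu$. By conjugation, the differential operators become

\[
B^{-1}\omega\cdot\partial_\varphi B = \rho(\vartheta)\,\omega\cdot\partial_\vartheta, \qquad B^{-1}\partial_yB = \partial_y, \qquad \rho := B^{-1}(1 + \omega\cdot\partial_\varphi\alpha). \tag{8.19}
\]

By (8.13), using also that $B$ and $B^{-1}$ commute with $\Pi_S^\perp$, the conjugate operator $B^{-1}\mathcal{L}_1B$ is equal to

\[
\Pi_S^\perp\big[\rho\,\omega\cdot\partial_\vartheta + (B^{-1}b_3)\partial_{yyy} + (B^{-1}b_1)\partial_y + (B^{-1}b_0)\big]\Pi_S^\perp + B^{-1}\mathfrak{R}_1B. \tag{8.20}
\]

We choose $\alpha$ such that $(B^{-1}b_3)(\vartheta) = m_3\rho(\vartheta)$ for some constant $m_3 \in \mathbb{R}$, namely

\[
b_3(\varphi) = m_3\big(1 + \omega\cdot\partial_\varphi\alpha(\varphi)\big) \tag{8.21}
\]

(recall (8.19)). The unique solution with zero average of (8.21) is

\[
\alpha(\varphi) := \frac{1}{m_3}(\omega\cdot\partial_\varphi)^{-1}(b_3 - m_3)(\varphi), \qquad m_3 := \frac{1}{(2\pi)^\nu}\int_{\mathbb{T}^\nu}b_3(\varphi)\,d\varphi. \tag{8.22}
\]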
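As an illustration of (8.21)-(8.22), the following sketch computes $m_3$ and $\alpha$ on a grid and checks the identity $b_3 = m_3(1 + \omega\cdot\partial_\varphi\alpha)$. It is a toy computation under assumptions: $\nu = 2$, and both the frequency vector and the function `b3` below are hypothetical placeholders rather than quantities produced by the scheme.

```python
import numpy as np

N = 32
phi = 2 * np.pi * np.arange(N) / N
P1, P2 = np.meshgrid(phi, phi, indexing="ij")
omega = np.array([1.0, np.sqrt(2.0)])               # hypothetical nonresonant frequency
b3 = 1.0 + 0.01 * np.cos(P1 - P2) + 0.02 * np.sin(2 * P1 + P2)  # hypothetical, close to 1

m3 = b3.mean()                                      # torus average, second formula in (8.22)

# Fourier multiplier of omega . d_phi is i * (omega . l)
l = np.fft.fftfreq(N, d=1.0 / N)
L1, L2 = np.meshgrid(l, l, indexing="ij")
div = omega[0] * L1 + omega[1] * L2
rhs_hat = np.fft.fft2(b3 - m3)
alpha_hat = np.zeros_like(rhs_hat)
ok = np.abs(div) > 1e-12                            # excludes l = 0, so alpha has zero average
alpha_hat[ok] = rhs_hat[ok] / (1j * div[ok] * m3)
alpha = np.fft.ifft2(alpha_hat).real

# Check (8.21): b3 = m3 (1 + omega . d_phi alpha)
D_alpha = np.fft.ifft2(1j * div * alpha_hat).real
residual = np.max(np.abs(b3 - m3 * (1.0 + D_alpha)))
```

The division by $\omega\cdot l$ is where the diophantine constant $\gamma$ enters: in the estimates of Lemma 8.4 it accounts for the factor $\gamma^{-1}$ in the bound for $\alpha$.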

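Hence, by (8.20),

\[
B^{-1}\mathcal{L}_1B = \rho\mathcal{L}_2, \qquad \mathcal{L}_2 := \Pi_S^\perp\big(\omega\cdot\partial_\vartheta + m_3\partial_{yyy} + c_1\partial_y + c_0\big)\Pi_S^\perp + \mathfrak{R}_2, \tag{8.23}
\]

\[
c_1 := \rho^{-1}(B^{-1}b_1), \qquad c_0 := \rho^{-1}(B^{-1}b_0), \qquad \mathfrak{R}_2 := \rho^{-1}B^{-1}\mathfrak{R}_1B. \tag{8.24}
\]

The transformed operator $\mathcal{L}_2$ in (8.23) is still Hamiltonian, because the reparametrization of time preserves the Hamiltonian structure (see Section 2.2 and Remark 3.7 in [3]).

Lemma 8.4. There is $\sigma = \sigma(\nu,\tau) > 0$ (possibly larger than $\sigma$ in Lemma 8.3) such that

\[
|m_3 - 1|^{\mathrm{Lip}(\gamma)} \leq C\varepsilon^3, \qquad |\partial_im_3[\hat\imath]| \leq C\varepsilon^3\|\hat\imath\|_{s_0+\sigma}, \tag{8.25}
\]

\[
\begin{aligned}
\|\alpha\|_s^{\mathrm{Lip}(\gamma)} &\leq_s \varepsilon^3\gamma^{-1}\big(1 + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}\big), \\
\|\partial_i\alpha[\hat\imath]\|_s &\leq_s \varepsilon^3\gamma^{-1}\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big), \\
\|\rho - 1\|_s^{\mathrm{Lip}(\gamma)} &\leq_s \varepsilon^3\big(1 + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}\big), \\
\|\partial_i\rho[\hat\imath]\|_s &\leq_s \varepsilon^3\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big),
\end{aligned}
\]

\[
\|c_1 - 3\varsigma T_\delta^2\|_s^{\mathrm{Lip}(\gamma)} + \|c_0 - 3\varsigma(T_\delta^2)_x\|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^5\gamma^{-1}\big(1 + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}\big), \tag{8.26}
\]

\[
\|\partial_i(c_1 - 3\varsigma T_\delta^2)[\hat\imath]\|_s + \|\partial_i(c_0 - 3\varsigma(T_\delta^2)_x)[\hat\imath]\|_s \leq_s \varepsilon^5\gamma^{-1}\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big).
\]

The transformations $B$, $B^{-1}$ satisfy the estimates (8.16), (8.17). The remainder $\mathfrak{R}_2$ has the form (7.4), and the functions $g_j(\tau)$, $\chi_j(\tau)$ satisfy the estimates (7.15)-(7.16) for all $\tau \in [0,1]$.

Proof. To estimate $\|\alpha\|_s^{\mathrm{Lip}(\gamma)}$ we also differentiate (8.22) with respect to the parameter $\omega$. Note that $c_1 - 3\varsigma B^{-1}(T_\delta^2) = O(\varepsilon^3)$, and similarly $c_0 - 3\varsigma B^{-1}((T_\delta^2)_x) = O(\varepsilon^3)$. The factor $\varepsilon^5\gamma^{-1}$ in the last two inequalities comes from the estimate of the difference $B^{-1}(T_\delta^2) - T_\delta^2 \sim (T_\delta^2)_\varphi\,\alpha = O(\varepsilon^2\varepsilon^3\gamma^{-1})$. □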


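8.3 Translation of the space variable

In this section we remove the space average from the coefficient in front of $\partial_y$. Consider the change of the space variable $z = y + p(\vartheta)$ which induces on $H^s_{S^\perp}(\mathbb{T}^{\nu+1})$ the operators

\[
(\mathcal{T}w)(\vartheta,y) := w(\vartheta,y + p(\vartheta)), \qquad (\mathcal{T}^{-1}h)(\vartheta,z) = h(\vartheta,z - p(\vartheta)) \tag{8.27}
\]

(which are a particular case of those used in section 8.1). The differential operators become $\mathcal{T}^{-1}\omega\cdot\partial_\vartheta\mathcal{T} = \omega\cdot\partial_\vartheta + \{\omega\cdot\partial_\vartheta p(\vartheta)\}\partial_z$, $\mathcal{T}^{-1}\partial_y\mathcal{T} = \partial_z$. Since $\mathcal{T}$, $\mathcal{T}^{-1}$ commute with $\Pi_S^\perp$, we get

\[
\mathcal{L}_3 := \mathcal{T}^{-1}\mathcal{L}_2\mathcal{T} = \Pi_S^\perp\big(\omega\cdot\partial_\vartheta + m_3\partial_{zzz} + d_1\partial_z + d_0\big)\Pi_S^\perp + \mathfrak{R}_3, \tag{8.28}
\]

\[
d_1 := (\mathcal{T}^{-1}c_1) + \omega\cdot\partial_\vartheta p, \qquad d_0 := \mathcal{T}^{-1}c_0, \qquad \mathfrak{R}_3 := \mathcal{T}^{-1}\mathfrak{R}_2\mathcal{T}. \tag{8.29}
\]

We choose

\[
m_1 := \frac{1}{(2\pi)^{\nu+1}}\int_{\mathbb{T}^{\nu+1}}c_1\,d\vartheta\,dy, \qquad p := (\omega\cdot\partial_\vartheta)^{-1}\Big(m_1 - \frac{1}{2\pi}\int_{\mathbb{T}}c_1\,dy\Big), \tag{8.30}
\]

so that

\[
\frac{1}{2\pi}\int_{\mathbb{T}}d_1(\vartheta,z)\,dz = m_1 \qquad \forall\vartheta \in \mathbb{T}^\nu. \tag{8.31}
\]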

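Recalling (8.26), we analyze the space average of $c_1$ in more detail. To avoid ambiguity between the space variable $y \in \mathbb{T}$ and the action $y_\delta : \mathbb{T}^\nu \to \mathbb{R}^\nu$ of (7.8), we rename $x \in \mathbb{T}$ the space variable, and $\varphi \in \mathbb{T}^\nu$ the variable on the torus (time variable). Let

\[
\bar v(\varphi,x) := \sum_{j\in S}\sqrt{\xi_j}\,e^{i\ell(j)\cdot\varphi}e^{ijx}, \tag{8.32}
\]

where $\ell : S \to \mathbb{Z}^\nu$ is the odd injective map (see (1.11))

\[
\ell(\bar\jmath_i) := e_i, \qquad \ell(-\bar\jmath_i) := -e_i, \qquad i = 1,\ldots,\nu \tag{8.33}
\]

and $e_i = (0,\ldots,1,\ldots,0)$ denotes the $i$-th vector of the canonical basis of $\mathbb{R}^\nu$. In view of the next linear Birkhoff normal form step (whose goal is to normalize the term of size $\varepsilon^2$), we observe that the component of order $\varepsilon^2$ in $T_\delta^2$ (see (7.8)) is $\varepsilon^2\bar v^2$, with

\[
\begin{aligned}
\|T_\delta^2 - \varepsilon^2\bar v^2\|_s^{\mathrm{Lip}(\gamma)} &\leq_s \varepsilon^2\|\mathfrak{I}_\delta\|_s^{\mathrm{Lip}(\gamma)}, \\
\|\partial_i(T_\delta^2 - \varepsilon^2\bar v^2)[\hat\imath]\|_s &\leq_s \varepsilon^2\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big).
\end{aligned}
\]

Moreover, from (7.8), since $(v_\delta,z_0)_{L^2(\mathbb{T})} = 0$, and $(\theta_0)_{-j} = -(\theta_0)_j$ for all $j \in S$, we have

\[
\int_{\mathbb{T}}T_\delta^2\,dx = \varepsilon^2\int_{\mathbb{T}}v_\delta^2\,dx + \varepsilon^{2b}\int_{\mathbb{T}}z_0^2\,dx = \varepsilon^2\sum_{j\in S}\xi_j + \varepsilon^{2b}\sum_{j\in S}|j|(y_\delta)_j + \varepsilon^{2b}\int_{\mathbb{T}}z_0^2\,dx.
\]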
We define 

d\ := di — 3g£ 2 v 2 , do : = do — 3<;£ 2 (v 2 ) x , (8.35) 

and note that, by (|8.31l) and (18.321) . 


1 

27T 


/ 
J T 


d\ dx = mi — 


3g£ 2 

~ 2 tT 


/ 


-2 


dx = mi — e 2 c(£), 


c(0 := 3 

j£S 


(8.36) 
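The second equality in (8.36) rests on the fact that the space average of $\bar v^2$ does not depend on $\varphi$, since $\ell$ is odd and $S$ is symmetric. A small numerical check of this, under purely illustrative assumptions ($\nu = 2$, tangential sites $\{\pm1,\pm3\}$, and arbitrary positive amplitudes $\xi_j = \xi_{-j}$, none of which come from the paper), reads:

```python
import numpy as np

sites = [1, 3]                       # hypothetical positive tangential sites j_1, j_2
xi = {1: 1.2, 3: 1.7}                # hypothetical amplitudes, with xi_{-j} = xi_j
varsigma = 1                         # sign of the cubic term, +1 or -1

N = 32
x = 2 * np.pi * np.arange(N) / N
phi = (0.3, 1.1)                     # an arbitrary point of the torus T^2

# v_bar as in (8.32), with ell(+-j_i) = +-e_i as in (8.33)
v_bar = np.zeros(N, dtype=complex)
for i, j in enumerate(sites):
    for sign in (+1, -1):
        v_bar += np.sqrt(xi[j]) * np.exp(1j * sign * phi[i]) * np.exp(1j * sign * j * x)
v_bar = v_bar.real                   # the imaginary parts cancel because S is symmetric

space_average = np.mean(v_bar ** 2)  # (1/2pi) * integral of v_bar^2 over T
c_xi = 3 * varsigma * sum(xi[j] for j in sites for _ in (+1, -1))  # 3 varsigma sum over S
```

Changing the point `phi` leaves `space_average` unchanged, which is why $c(\xi)$ in (8.36) is a genuine constant.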


Using the explicit formulae above, and Lemma 7.2 for the estimate of $\mathfrak{R}_3$, we get the following bounds.


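Lemma 8.5. There is $\sigma := \sigma(\nu,\tau) > 0$ (possibly larger than in Lemma 8.4) such that

\[
|m_1 - \varepsilon^2c(\xi)|^{\mathrm{Lip}(\gamma)} \leq C\varepsilon^5\gamma^{-1}, \qquad |\partial_im_1[\hat\imath]| \leq C\varepsilon^{2b}\|\hat\imath\|_{s_0+\sigma}, \tag{8.37}
\]

\[
\begin{aligned}
\|p\|_s^{\mathrm{Lip}(\gamma)} &\leq_s \varepsilon^5\gamma^{-2} + \|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}, \\
\|\partial_ip[\hat\imath]\|_s &\leq_s \|\hat\imath\|_{s+\sigma} + \varepsilon^5\gamma^{-2}\|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma},
\end{aligned}
\]

\[
\begin{aligned}
\|\tilde d_k\|_s^{\mathrm{Lip}(\gamma)} &\leq_s \varepsilon^7\gamma^{-2} + \varepsilon^2\|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}, \qquad k = 0,1, \\
\|\partial_i\tilde d_k[\hat\imath]\|_s &\leq_s \varepsilon^2\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big), \qquad k = 0,1.
\end{aligned} \tag{8.38}
\]

The matrix $s$-decay norm (see (2.4)) of the operator $\mathfrak{R}_3$ satisfies

\[
|\mathfrak{R}_3|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^{1+b}\|\mathfrak{I}_\delta\|_{s+\sigma}^{\mathrm{Lip}(\gamma)}, \qquad |\partial_i\mathfrak{R}_3[\hat\imath]|_s \leq_s \varepsilon^{1+b}\big(\|\hat\imath\|_{s+\sigma} + \|\mathfrak{I}_\delta\|_{s+\sigma}\|\hat\imath\|_{s_0+\sigma}\big). \tag{8.39}
\]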

The transformations $\mathcal{T}$, $\mathcal{T}^{-1}$ satisfy (8.16), (8.17).

Remark 8.6. When $K = H + \lambda M^2$, $\lambda = 3\varsigma/4$, the constant coefficient $m_1$ in (8.30) becomes of size

\[
|m_1|^{\mathrm{Lip}(\gamma)} \leq C\varepsilon^5\gamma^{-1}. \tag{8.40}
\]

The inequality (8.40) is the key difference between the cases $H + (3\varsigma/4)M^2$ and $H$ (compare (8.40) with (8.37), where $m_1$ contains the non-perturbative term $\varepsilon^2c(\xi)$). □

It is sufficient to estimate $\mathfrak{R}_3$ (which has the form (7.4)) only in the $s$-decay norm (see (8.39)) because the next transformations will preserve it. Such norms will be used in the reducibility scheme of section 8.6.

8.4 Linear Birkhoff normal form 

Now we normalize the terms of order $\varepsilon^2$ of $\mathcal{L}_3$. This step is different from the reducibility steps that we shall perform in section 8.6: the diophantine constant $\gamma$ in (5.3) is $\gamma = o(\varepsilon^2)$, and therefore the terms of order $\varepsilon^2$ are not perturbative, because $\varepsilon^2\gamma^{-1}$ is not small (in fact, it is big). The reduction of this section is possible thanks to the special form of the term $\varepsilon^2\mathcal{B}$ defined in (8.41): the harmonics of $\varepsilon^2\mathcal{B}$ corresponding to a possible small divisor are naught, except $\mathcal{B}_j^j(0)$, see Lemma 8.9. Note that, since the previous linear transformations $\Phi$, $B$, $\mathcal{T}$ are $O(\varepsilon^5\gamma^{-2})$-close to the identity, the terms of order $\varepsilon^2$ in $\mathcal{L}_3$ are the same as in the original linearized operator.

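First, we collect all the terms of order $\varepsilon^2$ in the operator $\mathcal{L}_3$ in (8.28). We have

\[
\mathcal{L}_3 = \Pi_S^\perp\big(\omega\cdot\partial_\varphi + m_3\partial_{xxx} + \varepsilon^2\mathcal{B} + \tilde d_1\partial_x + \tilde d_0\big)\Pi_S^\perp + \mathfrak{R}_3
\]

where $\tilde d_1$, $\tilde d_0$, $\mathfrak{R}_3$ are defined in (8.35), (8.29) and (recall (8.32))

\[
\mathcal{B}h := 3\varsigma\bar v^2\partial_xh + 3\varsigma(\bar v^2)_xh = \partial_x(3\varsigma\bar v^2h). \tag{8.41}
\]

Note that $\mathcal{B}$ is the linear Hamiltonian vector field of $H_S^\perp$ generated by the Hamiltonian $z \mapsto \frac{3\varsigma}{2}\int_{\mathbb{T}}\bar v^2z^2\,dx$.

We transform $\mathcal{L}_3$ by a symplectic operator $\Phi_2 : H^s_{S^\perp}(\mathbb{T}^{\nu+1}) \to H^s_{S^\perp}(\mathbb{T}^{\nu+1})$ of the form

\[
\Phi_2 := \exp(\varepsilon^2\mathcal{A}) = I_{H_S^\perp} + \varepsilon^2\mathcal{A} + \varepsilon^4\hat{\mathcal{A}}, \qquad \hat{\mathcal{A}} := \sum_{k\geq2}\frac{\varepsilon^{2(k-2)}}{k!}\mathcal{A}^k, \tag{8.42}
\]

where $\mathcal{A}(\varphi)h = \sum_{j,j'\in S^c}\mathcal{A}_j^{j'}(\varphi)h_{j'}e^{ijx}$ is a Hamiltonian vector field. The map $\Phi_2$ is symplectic, because it is the time 1 flow of a Hamiltonian vector field. We calculate

\[
\begin{aligned}
\mathcal{L}_3\Phi_2 - \Phi_2\Pi_S^\perp(\mathcal{D}_\omega + m_3\partial_{xxx})\Pi_S^\perp &= \varepsilon^2\Pi_S^\perp\{\mathcal{B} + (\mathcal{D}_\omega\mathcal{A}) + m_3[\partial_{xxx},\mathcal{A}]\}\Pi_S^\perp \\
&\quad + \Pi_S^\perp\tilde d_1\partial_x\Pi_S^\perp + \tilde R_3
\end{aligned} \tag{8.43}
\]

where

\[
\begin{aligned}
\tilde R_3 &:= \varepsilon^4\Pi_S^\perp\{(\mathcal{D}_\omega\hat{\mathcal{A}}) + m_3[\partial_{xxx},\hat{\mathcal{A}}] + \mathcal{B}(\mathcal{A} + \varepsilon^2\hat{\mathcal{A}})\}\Pi_S^\perp \\
&\quad + \Pi_S^\perp\tilde d_1\partial_x\Pi_S^\perp(\Phi_2 - I) + (\Pi_S^\perp\tilde d_0\Pi_S^\perp + \mathfrak{R}_3)\Phi_2.
\end{aligned} \tag{8.44}
\]

Remark 8.7. $\tilde R_3$ no longer has the form (7.4). However $\tilde R_3 = O(\partial_x^0)$ because $\mathcal{A} = O(\partial_x^{-1})$ (see Lemma 8.12), and therefore $\Phi_2 - I_{H_S^\perp} = O(\partial_x^{-1})$. Moreover the matrix decay norm of $\tilde R_3$ is $o(\varepsilon^2)$. □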

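In order to normalize the term of order $\varepsilon^2$ of (8.43), we develop $\mathcal{A}_j^{j'}(\varphi) = \sum_{l\in\mathbb{Z}^\nu}\mathcal{A}_j^{j'}(l)e^{il\cdot\varphi}$, and for each $j,j' \in S^c$, $l \in \mathbb{Z}^\nu$, we choose

\[
\mathcal{A}_j^{j'}(l) := \begin{cases} -\dfrac{\mathcal{B}_j^{j'}(l)}{i(\omega\cdot l + m_3(j'^3 - j^3))} & \text{if } \bar\omega\cdot l + j'^3 - j^3 \neq 0, \\[2mm] 0 & \text{otherwise.} \end{cases} \tag{8.45}
\]

This definition is well posed. Indeed, by (8.41) and (8.32),

\[
\mathcal{B}_j^{j'}(l) := 3\varsigma\,ij\sum_{\substack{j_1,j_2\in S,\ j_1+j_2=j-j' \\ \ell(j_1)+\ell(j_2)=l}}\sqrt{\xi_{j_1}\xi_{j_2}}. \tag{8.46}
\]

In particular $\mathcal{B}_j^{j'}(l) = 0$ unless $|l| \leq 2$. For $|l| \leq 2$ and $\bar\omega\cdot l + j'^3 - j^3 \neq 0$, the denominators in (8.45) satisfy

\[
\begin{aligned}
|\omega\cdot l + m_3(j'^3 - j^3)| &= |m_3(\bar\omega\cdot l + j'^3 - j^3) + (\omega - m_3\bar\omega)\cdot l| \\
&\geq |m_3||\bar\omega\cdot l + j'^3 - j^3| - |\omega - m_3\bar\omega||l| \geq 1/2
\end{aligned} \tag{8.47}
\]

for $\varepsilon$ small, because $|\bar\omega\cdot l + j'^3 - j^3| \geq 1$ ($\bar\omega\cdot l + j'^3 - j^3$ is a nonzero integer), $\omega = \bar\omega + O(\varepsilon^2)$ and by (8.25).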

Remark 8.8. The operator $\mathcal{A}$ defined in (8.45) is Hamiltonian, because $\mathcal{B}$ is Hamiltonian. The reason is a general fact: the denominators $\delta_{l,j,k} := i(\omega\cdot l + m_3(k^3 - j^3))$ satisfy $\overline{\delta_{l,j,k}} = \delta_{-l,k,j}$ and an operator $G(\varphi)$ is self-adjoint if and only if its matrix elements satisfy $\overline{G_j^k(l)} = G_k^j(-l)$, see [3]-Remark 4.5. Alternatively, we could solve the homological equation of this Birkhoff step directly for the Hamiltonian function whose flow generates $\Phi_2$. □

By the definition (8.45), the term of order $\varepsilon^2$ in (8.43) is zero on the Fourier indices $(l,j,j')$ such that $\bar\omega\cdot l + j'^3 - j^3 \neq 0$, while it is equal to $\varepsilon^2\mathcal{B}_j^{j'}(l)$ for $(l,j,j')$ such that $\bar\omega\cdot l + j'^3 - j^3 = 0$. Now we prove that the only nonzero components of $\mathcal{B}$ that remain in (8.43) are $\mathcal{B}_j^j(0)$.

Lemma 8.9. If $\bar\omega\cdot l + j'^3 - j^3 = 0$ and $\mathcal{B}_j^{j'}(l) \neq 0$, then $l = 0$ and $j = j'$.

Proof. If $\mathcal{B}_j^{j'}(l) \neq 0$, then, by (8.46), there exist $j_1, j_2 \in S$ such that $j_1 + j_2 = j - j'$ and $\ell(j_1) + \ell(j_2) = l$. Hence, recalling (1.19) and (8.33),

\[
0 = \bar\omega\cdot l + j'^3 - j^3 = \bar\omega\cdot\ell(j_1) + \bar\omega\cdot\ell(j_2) + j'^3 - j^3 = j_1^3 + j_2^3 + j'^3 - j^3.
\]

This equality, together with $j_1 + j_2 + j' - j = 0$, implies that $(j_1 + j_2)(j_1 + j')(j_2 + j') = 0$ by Lemma 2.2. Since $j_1, j_2 \in S$, $j' \in S^c$, the set $S$ is symmetric, and $0 \notin S$, we deduce that the factors $j_1 + j'$ and $j_2 + j'$ are nonzero. Hence $j_1 + j_2 = 0$, so that $j = j'$, and therefore $l = \ell(j_1) + \ell(-j_1) = 0$. □

Thus, the only nonzero term of order $\varepsilon^2$ in (8.43) is $\mathcal{B}_j^j(0)$. By (8.46), we calculate $\mathcal{B}_j^j(0) = ij\,c(\xi)$, where $c(\xi)$ is defined in (8.36). Hence, by (8.45), Lemma 8.9 and (8.36), the term of order $\varepsilon^2$ in (8.43) is

\[
\varepsilon^2\Pi_S^\perp\{\mathcal{B} + (\mathcal{D}_\omega\mathcal{A}) + m_3[\partial_{xxx},\mathcal{A}]\}\Pi_S^\perp = \varepsilon^2c(\xi)\partial_x\Pi_S^\perp. \tag{8.48}
\]
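The arithmetic behind Lemma 8.9 can be checked by brute force. The sketch below verifies, over a finite range of integers, the factorization $j_1^3 + j_2^3 + j'^3 - (j_1 + j_2 + j')^3 = -3(j_1 + j_2)(j_1 + j')(j_2 + j')$, and then enumerates the resonant triples for a hypothetical set of tangential sites. The set `S_plus`, the search range, and the function name are illustrative choices, and $S^c$ is taken here to be $\mathbb{Z}\setminus(S\cup\{0\})$, which is an assumption about the convention.

```python
from itertools import product

def cubic_resonance(j1, j2, jp):
    """With j = j1 + j2 + jp, return (j1^3 + j2^3 + jp^3 - j^3, -3 (j1+j2)(j1+jp)(j2+jp))."""
    j = j1 + j2 + jp
    lhs = j1 ** 3 + j2 ** 3 + jp ** 3 - j ** 3
    rhs = -3 * (j1 + j2) * (j1 + jp) * (j2 + jp)
    return lhs, rhs

# 1) The factorization holds identically on a finite box of integers.
R = range(-12, 13)
identity_holds = all(lhs == rhs
                     for lhs, rhs in (cubic_resonance(*t) for t in product(R, R, R)))

# 2) Resonant triples with j1, j2 in S and j', j in the complement force j1 + j2 = 0.
S_plus = [1, 3, 4]                              # hypothetical tangential sites
S = sorted(S_plus + [-s for s in S_plus])       # symmetric set, 0 not in S
resonant = [(j1, j2, jp)
            for j1, j2 in product(S, S)
            for jp in range(-30, 31)
            if jp != 0 and jp not in S
            and (j1 + j2 + jp) != 0 and (j1 + j2 + jp) not in S
            and cubic_resonance(j1, j2, jp)[0] == 0]
only_trivial = all(j1 + j2 == 0 for j1, j2, jp in resonant)
```

Every resonant triple found has $j_1 + j_2 = 0$, hence $j = j'$ and $l = \ell(j_1) + \ell(-j_1) = 0$, in agreement with the lemma; these are exactly the diagonal terms $\mathcal{B}_j^j(0)$ that survive in (8.48).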

Remark 8.10. When $K = H + \lambda M^2$, $\lambda = 3\varsigma/4$, the operator in (8.41) becomes $\mathcal{B}h = \partial_x(3\varsigma\,\pi_0(\bar v^2)h)$. Hence $\mathcal{B}_j^j(0) = 0$, and the right-hand side term in (8.48) is zero, namely the first step of linear Birkhoff normal form completely eliminates all the terms of order $\varepsilon^2$. □

We now estimate the transformation $\mathcal{A}$.

Lemma 8.11. (i) For all $l \in \mathbb{Z}^\nu$, $j,j' \in S^c$,

\[
|\mathcal{A}_j^{j'}(l)| \leq C(|j| + |j'|)^{-1}, \qquad |\mathcal{A}_j^{j'}(l)|^{\mathrm{lip}} \leq \varepsilon^{-2}(|j| + |j'|)^{-1}. \tag{8.49}
\]

(ii) $\mathcal{A}_j^{j'}(l) = 0$ for all $l \in \mathbb{Z}^\nu$, $j,j' \in S^c$ such that $|j - j'| > 2C_S$, where $C_S := \max\{|j| : j \in S\}$.

Proof. (i) As already observed, for all $|l| > 2$ one has $\mathcal{B}_j^{j'}(l) = 0$, and therefore $\mathcal{A}_j^{j'}(l) = 0$. For $|l| \leq 2$, $j \neq j'$, one has (since $|\omega| \leq |\bar\omega| + 1$)

\[
|\omega\cdot l + m_3(j'^3 - j^3)| \geq |m_3||j'^3 - j^3| - |\omega\cdot l| \geq \tfrac14(j'^2 + j^2) - 2|\omega| \geq \tfrac18(j'^2 + j^2)
\]

for $(j'^2 + j^2) \geq C$, for some constant $C$. Since also (8.47) holds, we deduce that, for all $j \neq j'$,

\[
\mathcal{A}_j^{j'}(l) \neq 0 \quad\Longrightarrow\quad |\omega\cdot l + m_3(j'^3 - j^3)| \geq c(|j| + |j'|)^2. \tag{8.50}
\]

On the other hand, if $j = j' \in S^c$, and $l \neq 0$, then $\mathcal{B}_j^j(l) = 0$, and therefore $\mathcal{A}_j^j(l) = 0$. For $j = j'$ and $l = 0$ we also have $\mathcal{A}_j^j(l) = 0$ because $\bar\omega\cdot l + j'^3 - j^3 = 0$. Hence (8.50) holds for all $j,j'$. By (8.45), (8.50), (8.46) we deduce the first bound in (8.49). The Lipschitz bound follows similarly (use also $|j - j'| \leq 2C_S$). (ii) follows by (8.45)-(8.46). □

The previous lemma means that $A = O(|\partial_x|^{-1})$. More precisely, we deduce the
following bound.

Lemma 8.12 (Lemma 8.19 of [5]). $|A \partial_x|_s^{\mathrm{Lip}(\gamma)} + |\partial_x A|_s^{\mathrm{Lip}(\gamma)} \leq C(s)$.

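It follows that the symplectic map $\Phi_2$ in (8.42) is invertible for $\varepsilon$ small, with
inverse

$$\Phi_2^{-1} = \exp(-\varepsilon^2 A) = I_{H_S^\perp} + \varepsilon^2 \tilde A, \qquad \tilde A := \sum_{n \geq 1} \frac{\varepsilon^{2n-2}}{n!} (-A)^n, \qquad |\tilde A \partial_x|_s^{\mathrm{Lip}(\gamma)} + |\partial_x \tilde A|_s^{\mathrm{Lip}(\gamma)} \leq C(s). \tag{8.51}$$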

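By (8.43) and (8.48) we get the Hamiltonian operator

$$\mathcal{L}_4 := \Phi_2^{-1} \mathcal{L}_3 \Phi_2 = \Pi_S^\perp \big( \mathcal{D}_\omega + m_3 \partial_{xxx} + (\varepsilon^2 c(\xi) + d_1) \partial_x \big) \Pi_S^\perp + R_4 , \tag{8.52}$$

$$R_4 := (\Phi_2^{-1} - I) \Pi_S^\perp (\varepsilon^2 c(\xi) + d_1) \partial_x \Pi_S^\perp + \Phi_2^{-1} R_3 . \tag{8.53}$$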
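Lemma 8.13. There is $\sigma = \sigma(\nu, \tau) > 0$ (possibly larger than in Lemma 8.5) such
that

$$|R_4|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^7 \gamma^{-2} + \varepsilon^2 \| \mathfrak{I}_\delta \|_{s+\sigma}^{\mathrm{Lip}(\gamma)}, \qquad |\partial_i R_4 [\hat\imath]|_s \leq_s \varepsilon^{1+b} \| \hat\imath \|_{s+\sigma} + \varepsilon^2 \| \mathfrak{I}_\delta \|_{s+\sigma} \| \hat\imath \|_{s_0+\sigma}. \tag{8.54}$$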

Proof. Use iRTHl) . (18^21) . (1081) . (IQifl) . (jQ5j) and Lemma liU2l 

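8.5 Space reduction at the order $\partial_x$

The goal of this section is to transform $\mathcal{L}_4$ in (8.52) so that the coefficient of $\partial_x$
becomes constant. We conjugate $\mathcal{L}_4$ via a symplectic map of the form

$$\mathcal{S} := \exp\big( \Pi_S^\perp (w \partial_x^{-1}) \big) \Pi_S^\perp = \Pi_S^\perp (I + w \partial_x^{-1}) \Pi_S^\perp + \hat{\mathcal{S}}, \tag{8.55}$$

where $\hat{\mathcal{S}} := \sum_{k \geq 2} \frac{1}{k!} [\Pi_S^\perp (w \partial_x^{-1})]^k \Pi_S^\perp$ and $w : \mathbb{T}^{\nu+1} \to \mathbb{R}$ is a function. Note that
the linear operator $\Pi_S^\perp (w \partial_x^{-1}) \Pi_S^\perp$ is the Hamiltonian vector field generated by the
Hamiltonian $-\frac12 \int_{\mathbb{T}} w (\partial_x^{-1} h)^2 \, dx$, $h \in H_S^\perp$. We calculate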

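$$\mathcal{L}_4 \mathcal{S} - \mathcal{S} \Pi_S^\perp (\mathcal{D}_\omega + m_3 \partial_{xxx} + m_1 \partial_x) \Pi_S^\perp = \Pi_S^\perp (3 m_3 w_x + \varepsilon^2 c(\xi) + d_1 - m_1) \partial_x \Pi_S^\perp + \tilde R_5 ,$$

$$\tilde R_5 := \Pi_S^\perp \big\{ (3 m_3 w_{xx} + (\varepsilon^2 c(\xi) + d_1 - m_1) \Pi_S^\perp w) \pi_0 + ((\mathcal{D}_\omega w) + m_3 w_{xxx} + (\varepsilon^2 c(\xi) + d_1) \Pi_S^\perp w_x) \partial_x^{-1} + (\mathcal{D}_\omega \hat{\mathcal{S}}) + m_3 [\partial_{xxx}, \hat{\mathcal{S}}] + (\varepsilon^2 c(\xi) + d_1) \partial_x \hat{\mathcal{S}} - m_1 \hat{\mathcal{S}} \partial_x + R_4 \mathcal{S} \big\} \Pi_S^\perp ,$$

where $\tilde R_5$ collects all the terms of order at most $\partial_x^0$. By (8.36), we solve $3 m_3 w_x + \varepsilon^2 c(\xi) + d_1 - m_1 = 0$
by choosing $w := -(3 m_3)^{-1} \partial_x^{-1} (\varepsilon^2 c(\xi) + d_1 - m_1)$. For $\varepsilon$
small the operator $\mathcal{S}$ is invertible, and we get

$$\mathcal{L}_5 := \mathcal{S}^{-1} \mathcal{L}_4 \mathcal{S} = \Pi_S^\perp (\mathcal{D}_\omega + m_3 \partial_{xxx} + m_1 \partial_x) \Pi_S^\perp + R_5 , \qquad R_5 := \mathcal{S}^{-1} \tilde R_5 . \tag{8.56}$$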

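Since $\mathcal{S}$ is symplectic, $\mathcal{L}_5$ is Hamiltonian (recall Definition 2.2). By (8.38), (8.37),
(8.25), one has $|w|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^7 \gamma^{-2} + \varepsilon^2 \| \mathfrak{I}_\delta \|_{s+\sigma}^{\mathrm{Lip}(\gamma)}$.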

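Lemma 8.14. There is $\sigma = \sigma(\nu, \tau) > 0$ (possibly larger than in Lemma 8.13) such
that

$$|\mathcal{S}^{\pm 1} - I|_s^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^7 \gamma^{-2} + \varepsilon^2 \| \mathfrak{I}_\delta \|_{s+\sigma}^{\mathrm{Lip}(\gamma)},$$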

l^5 ±1 [?)| s < s £ 2fc ||^| s+(T + £ 5 7 - 1 l|:u|U +ff ||% 0+(T • 

The remainder $R_5$ satisfies the same estimates (8.54) as $R_4$.

8.6 KAM reducibility and inversion of $\mathcal{L}_\omega$

The coefficients $m_3$, $m_1$ of the operator $\mathcal{L}_5$ in (8.56) are constants, and the remainder
$R_5$ is a bounded operator of order $\partial_x^0$ with small matrix decay norm, see (8.59). Then
we can diagonalize $\mathcal{L}_5$ by applying the iterative KAM reducibility Theorem 4.2 in
[3] along the sequence of scales

$$N_n := N_0^{\chi^n}, \quad n = 0, 1, 2, \ldots, \quad \chi := 3/2, \quad N_0 > 0. \tag{8.57}$$
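The scales in (8.57) grow super-exponentially, since $N_{n+1} = N_n^{\chi}$ with $\chi = 3/2$. A minimal Python sketch of the sequence follows; the function name and the sample value of $N_0$ are illustrative assumptions.

```python
def scales(n0: float, steps: int, chi: float = 1.5) -> list:
    """Return [N_0, N_1, ..., N_{steps-1}] with N_n := N_0 ** (chi ** n)."""
    if n0 <= 0:
        raise ValueError("N_0 must be positive")
    return [n0 ** (chi ** n) for n in range(steps)]


if __name__ == "__main__":
    # Sample initial scale N_0 = 4 (an arbitrary choice for illustration).
    for n, N in enumerate(scales(4.0, 6)):
        print(n, N)
```

Each term is the previous one raised to the power $\chi$, so for $N_0 > 1$ the sequence is strictly increasing.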


In section 9 the initial $N_0$ will (slightly) increase to infinity as $\varepsilon \to 0$, see (9.5). The
required smallness condition (see (4.14) in [3]) is (written in the present notations)

$$N_0^{C_0} |R_5|_{s_0+\beta}^{\mathrm{Lip}(\gamma)} \gamma^{-1} \leq 1 \tag{8.58}$$

where $\beta := 7\tau + 6$ (see (4.1) in [3]), $\tau$ is the diophantine exponent in (5.3) and (8.63),
and the constant $C_0 := C_0(\tau, \nu) > 0$ is fixed in Theorem 4.2 in [3]. By Lemma 8.14
the remainder $R_5$ satisfies the bound (8.54), and using (7.5) we get (recall (5.9))

$$|R_5|_{s_0+\beta}^{\mathrm{Lip}(\gamma)} \leq C \varepsilon^7 \gamma^{-2} = C \varepsilon^{3-2a}, \qquad |R_5|_{s_0+\beta}^{\mathrm{Lip}(\gamma)} \gamma^{-1} \leq C \varepsilon^7 \gamma^{-3} = C \varepsilon^{1-3a}. \tag{8.59}$$

We use that $\mu$ in (7.5) is assumed to satisfy $\mu \geq \sigma + \beta$ where $\sigma := \sigma(\tau, \nu)$ is given
in Lemma 8.14.
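The two equalities in (8.59) are exponent arithmetic with $\gamma = \varepsilon^{2+a}$. As a quick check, the following Python sketch computes the resulting power of $\varepsilon$ exactly with rational numbers; the helper name is an illustrative assumption.

```python
from fractions import Fraction


def eps_exponent(eps_power, gamma_power, a):
    """Exponent of eps in eps**eps_power * gamma**gamma_power,
    assuming gamma = eps**(2 + a)."""
    return Fraction(eps_power) + Fraction(gamma_power) * (2 + Fraction(a))


if __name__ == "__main__":
    a = Fraction(1, 10)  # sample value in (0, 1/6)
    print(eps_exponent(7, -2, a), 3 - 2 * a)  # eps^7 gamma^-2 = eps^(3-2a)
    print(eps_exponent(7, -3, a), 1 - 3 * a)  # eps^7 gamma^-3 = eps^(1-3a)
```

For every $a \in (0, 1/6)$ both exponents are positive, so both quantities tend to zero as $\varepsilon \to 0$.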


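Theorem 8.15. (Reducibility) Assume that $\omega \mapsto i_\delta(\omega)$ is a Lipschitz function
defined on some subset $\Omega_o \subset \Omega_\varepsilon$ (recall (5.2)), satisfying (7.5) with $\mu \geq \sigma + \beta$,
where $\sigma := \sigma(\tau, \nu)$ is given in Lemma 8.14 and $\beta := 7\tau + 6$. Then there exists
$\delta_0 \in (0, 1)$ such that, if

$$N_0^{C_0} \varepsilon^7 \gamma^{-3} = N_0^{C_0} \varepsilon^{1-3a} \leq \delta_0 , \qquad \gamma := \varepsilon^{2b} := \varepsilon^{2+a}, \qquad a \in (0, 1/6), \tag{8.60}$$

then:

(i) (Eigenvalues). For all $\omega \in \Omega_\varepsilon$ there exists a sequence

$$\mu_j^\infty(\omega) := \mu_j^\infty(\omega, i_\delta(\omega)) := \mathrm{i} \big( -\tilde m_3(\omega) j^3 + \tilde m_1(\omega) j \big) + r_j^\infty(\omega), \quad j \in S^c, \tag{8.61}$$

where $\tilde m_3$, $\tilde m_1$ coincide with the coefficients $m_3$, $m_1$ of $\mathcal{L}_5$ in (8.56) for all $\omega \in \Omega_o$,
and

$$|\tilde m_3 - 1|^{\mathrm{Lip}(\gamma)} \leq C \varepsilon^3, \qquad |\tilde m_1 - \varepsilon^2 c(\xi)|^{\mathrm{Lip}(\gamma)} \leq C \varepsilon^{3-a}, \qquad |r_j^\infty|^{\mathrm{Lip}(\gamma)} \leq C \varepsilon^{3-2a} \quad \forall j \in S^c \tag{8.62}$$

for some $C > 0$ (and $c(\xi)$ is defined in (8.36)). All the eigenvalues $\mu_j^\infty$ are purely
imaginary. We define, for convenience, $\mu_0^\infty(\omega) := 0$.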

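(ii) (Conjugacy). For all $\omega$ in the set

$$\Omega_\infty^{2\gamma} := \Omega_\infty^{2\gamma}(i_\delta) := \Big\{ \omega \in \Omega_o : |\mathrm{i} \omega \cdot l + \mu_j^\infty(\omega) - \mu_k^\infty(\omega)| \geq \frac{2 \gamma |j^3 - k^3|}{\langle l \rangle^\tau}, \ \forall l \in \mathbb{Z}^\nu, \ \forall j, k \in S^c \cup \{0\} \Big\} \tag{8.63}$$

there is a real, bounded, invertible linear operator $\Phi_\infty(\omega) : H^s_{S^\perp}(\mathbb{T}^{\nu+1}) \to H^s_{S^\perp}(\mathbb{T}^{\nu+1})$,
with bounded inverse $\Phi_\infty^{-1}(\omega)$, that conjugates $\mathcal{L}_5$ in (8.56) to constant coefficients,
namely

$$\mathcal{L}_\infty(\omega) := \Phi_\infty^{-1}(\omega) \circ \mathcal{L}_5(\omega) \circ \Phi_\infty(\omega) = \omega \cdot \partial_\varphi + \mathcal{D}_\infty(\omega), \qquad \mathcal{D}_\infty(\omega) := \mathrm{diag}_{j \in S^c} \{ \mu_j^\infty(\omega) \} . \tag{8.64}$$

The transformations $\Phi_\infty$, $\Phi_\infty^{-1}$ are close to the identity in matrix decay norm, with

$$|\Phi_\infty - I|_{s, \Omega_\infty^{2\gamma}}^{\mathrm{Lip}(\gamma)} + |\Phi_\infty^{-1} - I|_{s, \Omega_\infty^{2\gamma}}^{\mathrm{Lip}(\gamma)} \leq_s \varepsilon^7 \gamma^{-3} + \varepsilon^2 \gamma^{-1} \| \mathfrak{I}_\delta \|_{s+\sigma}^{\mathrm{Lip}(\gamma)} . \tag{8.65}$$

Moreover $\Phi_\infty$, $\Phi_\infty^{-1}$ are symplectic, and $\mathcal{L}_\infty$ is a Hamiltonian operator.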
Proof. The proof closely follows the one of Theorem 4.1 in [3], which is based on
Theorem 4.2, Corollaries 4.1, 4.2 and Lemmata 4.1, 4.2 of [3]. Here $\omega \in \mathbb{R}^\nu$, while
in [3] the parameter $\lambda \in \mathbb{R}$, but Kirszbraun's Theorem on Lipschitz extension also
holds in $\mathbb{R}^\nu$. The bound (8.65) follows by Corollary 4.1 of [3] and the estimate of $R_5$
in Lemma 8.14 above.

To adapt the proof of [3] to the present case, the only changes in the statement
of Theorem 4.2 of [3] are: $\varepsilon^{3-2a}$ instead of $\varepsilon$ in (4.18) of [3], and $\varepsilon^{1+b}$ instead of $\varepsilon$ in
(4.23), (4.25) and (4.26) of [3]. The factor $\varepsilon^{1+b}$ comes from the bound for $\partial_i R_5$, see
Lemma 8.14 and (8.54). □

Remark 8.16. Theorem 4.2 in [3] also provides the Lipschitz dependence of the
(approximate) eigenvalues $\mu_j^n$ with respect to the unknown $i_0(\varphi)$, which is used for
the measure estimate (Lemma 9.3). □

All the parameters $\omega \in \Omega_\infty^{2\gamma}$ satisfy (specialize (8.63) for $k = 0$)

$$|\mathrm{i} \omega \cdot l + \mu_j^\infty(\omega)| \geq 2 \gamma |j|^3 \langle l \rangle^{-\tau}, \quad \forall l \in \mathbb{Z}^\nu, \ j \in S^c, \tag{8.66}$$

and the diagonal operator $\mathcal{L}_\infty$ is invertible.

In the following theorem we verify the inversion assumption (6.26) for $\mathcal{L}_\omega$.

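Theorem 8.17. (Inversion of $\mathcal{L}_\omega$) Assume the hypotheses of Theorem 8.15 and
(8.60). Then there exists $\sigma_1 := \sigma_1(\tau, \nu) > 0$ such that, $\forall \omega \in \Omega_\infty^{2\gamma}(i_\delta)$ (see (8.63)),
for any function $g \in H^{s+\sigma_1}_{S^\perp}(\mathbb{T}^{\nu+1})$ the equation $\mathcal{L}_\omega h = g$ has a solution $h = \mathcal{L}_\omega^{-1} g \in H^s_{S^\perp}(\mathbb{T}^{\nu+1})$, satisfying

$$\| \mathcal{L}_\omega^{-1} g \|_s^{\mathrm{Lip}(\gamma)} \leq_s \gamma^{-1} \big( \| g \|_{s+\sigma_1}^{\mathrm{Lip}(\gamma)} + \varepsilon^2 \gamma^{-1} \| \mathfrak{I}_0 \|_{s+\sigma_1}^{\mathrm{Lip}(\gamma)} \| g \|_{s_0}^{\mathrm{Lip}(\gamma)} \big) . \tag{8.67}$$

Proof. See the proof of Theorem 8.16 in [5]. □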

9 The Nash-Moser nonlinear iteration 

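In this section we prove Theorem 5.1. It will be a consequence of the Nash-Moser
Theorem 9.1 below.

Consider the finite-dimensional subspaces

$$E_n := \{ \mathfrak{I}(\varphi) = (\Theta, y, z)(\varphi) : \Theta = \Pi_n \Theta, \ y = \Pi_n y, \ z = \Pi_n z \}$$

where $N_n := N_0^{\chi^n}$ are introduced in (8.57), and $\Pi_n$ are the projectors (which, with
a small abuse of notation, we denote with the same symbol)

$$\Pi_n \Theta(\varphi) := \sum_{|l| < N_n} \Theta_l e^{\mathrm{i} l \cdot \varphi}, \qquad \Pi_n z(\varphi, x) := \sum_{|(l,j)| < N_n} z_{lj} e^{\mathrm{i}(l \cdot \varphi + j x)}, \tag{9.1}$$

where $\Theta(\varphi) = \sum_{l \in \mathbb{Z}^\nu} \Theta_l e^{\mathrm{i} l \cdot \varphi}$ and $z(\varphi, x) = \sum_{l \in \mathbb{Z}^\nu, j \in S^c} z_{lj} e^{\mathrm{i}(l \cdot \varphi + j x)}$ (for $\Pi_n y(\varphi)$ similar
definition as for $\Pi_n \Theta(\varphi)$). We define $\Pi_n^\perp := I - \Pi_n$. The classical smoothing
properties hold: for all $\alpha, s \geq 0$,

$$\| \Pi_n \mathfrak{I} \|_{s+\alpha}^{\mathrm{Lip}(\gamma)} \leq N_n^\alpha \| \mathfrak{I} \|_s^{\mathrm{Lip}(\gamma)} \quad \forall \mathfrak{I}(\omega) \in H^s, \qquad \| \Pi_n^\perp \mathfrak{I} \|_s^{\mathrm{Lip}(\gamma)} \leq N_n^{-\alpha} \| \mathfrak{I} \|_{s+\alpha}^{\mathrm{Lip}(\gamma)} \quad \forall \mathfrak{I}(\omega) \in H^{s+\alpha}. \tag{9.2}$$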
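The smoothing properties come from comparing the Sobolev weights of the retained and discarded Fourier modes with the truncation level. The following Python sketch illustrates this for a single periodic variable, using the plain Sobolev norm $\|u\|_s^2 = \sum_k \langle k \rangle^{2s} |u_k|^2$ with $\langle k \rangle := \max(1, |k|)$ rather than the weighted $\mathrm{Lip}(\gamma)$ norm of the paper. The one-dimensional setting, the random coefficients and all function names are simplifying assumptions.

```python
import math
import random


def sobolev_norm(coeffs: dict, s: float) -> float:
    """Plain Sobolev norm of a Fourier series given as {k: u_k}."""
    return math.sqrt(sum(max(1, abs(k)) ** (2 * s) * abs(c) ** 2
                         for k, c in coeffs.items()))


def split_modes(coeffs: dict, N: int):
    """Return (Pi_N u, Pi_N^perp u): modes with |k| < N and with |k| >= N."""
    low = {k: c for k, c in coeffs.items() if abs(k) < N}
    high = {k: c for k, c in coeffs.items() if abs(k) >= N}
    return low, high


def smoothing_holds(coeffs: dict, N: int, s: float, alpha: float,
                    tol: float = 1e-9) -> bool:
    """Check both smoothing inequalities up to a relative rounding tolerance."""
    low, high = split_modes(coeffs, N)
    ok_low = sobolev_norm(low, s + alpha) <= (
        N ** alpha * sobolev_norm(coeffs, s) * (1 + tol))
    ok_high = sobolev_norm(high, s) <= (
        N ** (-alpha) * sobolev_norm(coeffs, s + alpha) * (1 + tol))
    return ok_low and ok_high


def random_coeffs(kmax: int, seed: int) -> dict:
    """Hypothetical test data: decaying random complex Fourier coefficients."""
    rnd = random.Random(seed)
    return {k: complex(rnd.gauss(0, 1), rnd.gauss(0, 1)) / (1 + abs(k)) ** 3
            for k in range(-kmax, kmax + 1)}


if __name__ == "__main__":
    print(smoothing_holds(random_coeffs(40, 0), N=8, s=1.0, alpha=2.5))
```

Both inequalities hold mode by mode: $\langle k \rangle^\alpha \leq N^\alpha$ when $|k| < N$, and $\langle k \rangle^{-\alpha} \leq N^{-\alpha}$ when $|k| \geq N$, for any $N \geq 1$.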
We define the constants

$$\mu_1 := 3\mu + 9, \qquad \alpha := 3\mu_1 + 1, \qquad \alpha_1 := (\alpha - 3\mu)/2, \tag{9.3}$$

k:=3(/ii+/) x ) + 1, A := 6/ii + 3/? X +3, 0 < p < ^ 5 (9.4) 

where $\mu := \mu(\tau, \nu)$ is the "loss of regularity" defined in Theorem 6.9 (see (6.35)) and
$C_1$ is fixed below.

Theorem 9.1. (Nash-Moser) Assume that $f \in C^q$ with $q > s_0 + \beta_1 + \mu + 3$. Let
$\tau \geq \nu + 2$. Then there exist $C_1 > \max\{ \mu_1 + \alpha, C_0 \}$ (where $C_0 := C_0(\tau, \nu)$ is the one
in Theorem 8.15), $\delta_0 := \delta_0(\tau, \nu) > 0$ such that, if

iV 0 Cl e 6 *+ 2 7 " 2 <S 0 , 7 := e 2+a = e 26 , iV 0 := (eV 3 f , K := 5 - 2b , (9.5) 

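then, for all $n \geq 0$:

$(\mathcal{P}1)_n$ there exists a function $(\mathfrak{I}_n, \zeta_n) : \mathcal{G}_n \subseteq \Omega_\varepsilon \to E_{n-1} \times \mathbb{R}^\nu$, $\omega \mapsto (\mathfrak{I}_n(\omega), \zeta_n(\omega))$,
$(\mathfrak{I}_0, \zeta_0) := 0$, $E_{-1} := \{0\}$, satisfying $|\zeta_n|^{\mathrm{Lip}(\gamma)} \leq C \| \mathcal{F}(U_n) \|_{s_0}^{\mathrm{Lip}(\gamma)}$,

$$\| \mathfrak{I}_n \|_{s_0+\mu}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*} \gamma^{-1}, \qquad \| \mathcal{F}(U_n) \|_{s_0+\mu+3}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*}, \tag{9.6}$$

where $U_n := (i_n, \zeta_n)$ with $i_n(\varphi) = (\varphi, 0, 0) + \mathfrak{I}_n(\varphi)$. The sets $\mathcal{G}_n$ are defined
inductively by:

$$\mathcal{G}_0 := \Big\{ \omega \in \Omega_\varepsilon : |\omega \cdot l| \geq \frac{2\gamma}{\langle l \rangle^\tau} \ \forall l \in \mathbb{Z}^\nu \setminus \{0\} \Big\},$$

$$\mathcal{G}_{n+1} := \Big\{ \omega \in \mathcal{G}_n : |\mathrm{i} \omega \cdot l + \mu_j^\infty(i_n) - \mu_k^\infty(i_n)| \geq \frac{2 \gamma_n |j^3 - k^3|}{\langle l \rangle^\tau} \ \forall j, k \in S^c \cup \{0\}, \ l \in \mathbb{Z}^\nu \Big\}, \tag{9.7}$$

where $\gamma_n := \gamma (1 + 2^{-n})$ and $\mu_j^\infty(\omega) := \mu_j^\infty(\omega, i_n(\omega))$ are defined in (8.61) (and
$\mu_0^\infty(\omega) = 0$).

The difference $\hat{\mathfrak{I}}_n := \mathfrak{I}_n - \mathfrak{I}_{n-1}$ (where we set $\hat{\mathfrak{I}}_0 := 0$) is defined on $\mathcal{G}_n$, and
it satisfies

$$\| \hat{\mathfrak{I}}_1 \|_{s_0+\mu}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*} \gamma^{-1}, \qquad \| \hat{\mathfrak{I}}_n \|_{s_0+\mu}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*} \gamma^{-1} N_{n-1}^{-\alpha_1} \quad \forall n > 1. \tag{9.8}$$

$(\mathcal{P}2)_n$ $\| \mathcal{F}(U_n) \|_{s_0}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*} N_{n-1}^{-\alpha}$, where we set $N_{-1} := 1$.

$(\mathcal{P}3)_n$ (High norms). $\| \mathfrak{I}_n \|_{s_0+\beta_1}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*} \gamma^{-1} N_{n-1}^{\kappa}$ and $\| \mathcal{F}(U_n) \|_{s_0+\beta_1}^{\mathrm{Lip}(\gamma)} \leq C_* \varepsilon^{b_*} N_{n-1}^{\kappa}$.

$(\mathcal{P}4)_n$ (Measure). The measure of the "Cantor-like" sets $\mathcal{G}_n$ satisfies

$$|\Omega_\varepsilon \setminus \mathcal{G}_0| \leq C_* \varepsilon^{2(\nu-1)} \gamma, \qquad |\mathcal{G}_n \setminus \mathcal{G}_{n+1}| \leq C_* \varepsilon^{2(\nu-1)} \gamma N_{n-1}^{-1}. \tag{9.9}$$

All the Lip norms are defined on $\mathcal{G}_n$, namely $\| \ \|_s^{\mathrm{Lip}(\gamma)} = \| \ \|_{s, \mathcal{G}_n}^{\mathrm{Lip}(\gamma)}$.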
Proof. To simplify notations, in this proof we denote $\| \ \|^{\mathrm{Lip}(\gamma)}$ by $\| \ \|$.

Step 1: Proof of $(\mathcal{P}1,2,3)_0$. Recalling (5.6) we have $\| \mathcal{F}(U_0) \|_s = \| \mathcal{F}(\varphi, 0, 0, 0) \|_s = \| X_P(\varphi, 0, 0) \|_s \leq_s \varepsilon^{5-2b}$ by Lemma 5.3. Hence (recall that $b_* := 5 - 2b$) the
smallness conditions in $(\mathcal{P}1)_0$-$(\mathcal{P}3)_0$ hold taking $C_* := C_*(s_0 + \beta_1)$ large enough.

Step 2: Assume that $(\mathcal{P}1,2,3)_n$ hold for some $n \geq 0$, and prove $(\mathcal{P}1,2,3)_{n+1}$.
The proof of this step closely follows Step 2 in the proof of Theorem 9.1 of [5]. We
just mention the main changes: here it is convenient to define

$$w_n := \varepsilon^2 \gamma^{-2} \| \mathcal{F}(U_n) \|_{s_0}, \qquad B_n := \varepsilon^2 \gamma^{-1} \| \mathfrak{I}_n \|_{s_0+\beta_1} + \varepsilon^2 \gamma^{-2} \| \mathcal{F}(U_n) \|_{s_0+\beta_1}, \tag{9.10}$$

while the corresponding quantities defined in (9.18) of [5] have $\varepsilon$ instead of $\varepsilon^2$ (and
then, with definition (9.10), the bounds (9.19) of [5] are also valid here without
changes). In the present case, the estimates (9.20)-(9.21) of [5] for the quadratic
Taylor remainder have to be adapted by replacing the factor $\varepsilon$ with $\varepsilon^2$. The reason
for this improvement is that the nonlinearity in the mKdV equation is cubic, whereas
in the KdV equation considered in [5] the nonlinearity is just quadratic.

Remark 9.2. Since the KdV, respectively mKdV, nonlinearity is quadratic, respectively
cubic, the smallness condition required in [5] for the convergence of the
Nash-Moser scheme is stronger than for Theorem 9.1: it is $\varepsilon \| \mathcal{F}(\varphi, 0, 0) \|_{s_0+\mu} \gamma^{-2} \ll 1$
instead of $\varepsilon^2 \| \mathcal{F}(\varphi, 0, 0) \|_{s_0+\mu} \gamma^{-2} \ll 1$. As a consequence, fewer steps of Birkhoff normal
form are required (namely fewer monomials to work out in the original Hamiltonian)
to reach the smallness $\mathcal{F}(U_0) = O(\varepsilon^{5-2b})$ that suffices for the Nash-Moser scheme
to converge (in [5] one needs $\mathcal{F}(U_0) = O(\varepsilon^{6-2b})$). □

Step 3: Prove $(\mathcal{P}4)_n$ for all $n \geq 0$. For all $n \geq 0$, the difference $\mathcal{G}_n \setminus \mathcal{G}_{n+1}$ is the
union over $l \in \mathbb{Z}^\nu$, $j, k \in S^c \cup \{0\}$ of the sets $R_{ljk}(i_n)$, where

$$R_{ljk}(i_n) := \big\{ \omega \in \mathcal{G}_n : |\mathrm{i} \omega \cdot l + \mu_j^\infty(i_n) - \mu_k^\infty(i_n)| < 2 \gamma_n |j^3 - k^3| \langle l \rangle^{-\tau} \big\} . \tag{9.11}$$

Since $R_{ljk}(i_n) = \emptyset$ for $j = k$, in the sequel we assume that $j \neq k$.

Lemma 9.3. For $n \geq 1$, $|l| \leq N_{n-1}$, one has the inclusion $R_{ljk}(i_n) \subseteq R_{ljk}(i_{n-1})$.

Proof. The proof closely follows the one of Lemma 5.2 in [3]. The differences are that
here the vector $\omega$ is not confined along a fixed direction, here we have $N_{n-1}$ instead
of $N_n$, and the factor $\varepsilon$ in (5.28) and (5.33) of [3] is replaced here by $\varepsilon^7 \gamma^{-2} = \varepsilon^{3-2a}$.

In the proof we use (9.8), (8.59), (8.25), (8.37), and the bounds (4.25), (4.26),
(4.34) of [3] adapted to the present case (the bounds (4.25), (4.26) of [3] hold here
with $\varepsilon^{1+b}$ instead of $\varepsilon$, as already pointed out in the proof of Theorem 8.15; the
bound (4.34) of [3] holds here with no change). □

By definition, $R_{ljk}(i_n) \subseteq \mathcal{G}_n$ (see (9.11)). By Lemma 9.3, for $n \geq 1$ and $|l| \leq N_{n-1}$
we also have $R_{ljk}(i_n) \subseteq R_{ljk}(i_{n-1})$. On the other hand, $R_{ljk}(i_{n-1}) \cap \mathcal{G}_n = \emptyset$
(see (9.7)). As a consequence, $R_{ljk}(i_n) = \emptyset$ for all $|l| \leq N_{n-1}$, and

$$\mathcal{G}_n \setminus \mathcal{G}_{n+1} \subseteq \bigcup_{\substack{j, k \in S^c \cup \{0\} \\ |l| > N_{n-1}}} R_{ljk}(i_n) \quad \forall n \geq 1. \tag{9.12}$$

Lemma 9.4. Let $n \geq 0$. If $R_{ljk}(i_n) \neq \emptyset$, then $|l| \geq C_1 |j^3 - k^3| \geq \frac12 C_1 (j^2 + k^2)$ for
some constant $C_1 > 0$ (independent of $l, j, k, n, i_n, \omega$).
Proof. Follow the proof of Lemma 5.3 of [3], also using (8.62). Note that $|\omega| \leq 2 |\bar\omega|$
for all $\omega \in \Omega_\varepsilon$, for $\varepsilon$ small enough, by (5.2) and (4.10). □
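The second inequality in Lemma 9.4 is elementary: for integers $j \neq k$ one has $|j - k| \geq 1$ and $j^2 + jk + k^2 \geq \frac12 (j^2 + k^2)$, hence $|j^3 - k^3| \geq \frac12 (j^2 + k^2)$. A brute-force Python check of this bound on a finite range (the function name and the range are illustrative assumptions):

```python
def cubic_gap_bound_holds(bound: int) -> bool:
    """Check 2*|j^3 - k^3| >= j^2 + k^2 for all integers j != k in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    return all(2 * abs(j**3 - k**3) >= j * j + k * k
               for j in rng for k in rng if j != k)


if __name__ == "__main__":
    print(cubic_gap_bound_holds(200))
```

The bound is sharp in the sense that equality occurs, for instance at $(j, k) = (1, 0)$ scaled by the factor 2 on the left: $2 |1 - 0| = 2 \geq 1$, while at $(j, k) = (0, -1)$ the same values appear by symmetry.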

Now we study the measure of the resonant sets $R_{ljk}(i_n)$ defined in (9.11). We
have to analyze in more detail the sublevels of the function

$$\omega \mapsto \phi(\omega) := \mathrm{i} \omega \cdot l + \mu_j^\infty(\omega) - \mu_k^\infty(\omega) \tag{9.13}$$

appearing in (9.11) ($\phi$ also depends on $l, j, k, i_n$).

Lemma 9.5. There exists $C_0 > 0$ such that for all $j \neq k$, with $j^2 + k^2 > C_0$, the
set $R_{ljk}(i_n)$ has Lebesgue measure $|R_{ljk}(i_n)| \leq C \varepsilon^{2(\nu-1)} \gamma \langle l \rangle^{-\tau}$.

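Proof. For $l \neq 0$, decompose $\omega = s \hat l + v$, where $\hat l := l/|l|$, $s \in \mathbb{R}$, and $l \cdot v = 0$ (so
that $\omega \cdot l = s |l|$). Let $\psi(s) := \phi(s \hat l + v)$. The eigenvalues $\mu_j^\infty$ are given in (8.61). By
(8.36) and (5.4), $\varepsilon^2 |c(\xi)|^{\mathrm{lip}} \leq C_2$ for some constant $C_2 > 0$ depending only on the
set $S$ of the tangential sites. Then, by (8.62) and (2.2),

$$|\tilde m_3(s_1) - \tilde m_3(s_2)| \leq C \varepsilon^3 \gamma^{-1} |s_1 - s_2|,$$
$$|\tilde m_1(s_1) - \tilde m_1(s_2)| \leq (C_2 + C' \varepsilon^5 \gamma^{-2}) |s_1 - s_2| \leq 2 C_2 |s_1 - s_2|,$$
$$|r_j^\infty(s_1) - r_j^\infty(s_2)| \leq C \varepsilon^{3-2a} \gamma^{-1} |s_1 - s_2|$$

for some $C > 0$ and $\varepsilon$ small enough, where, with a slight abuse of notations, we have
written $\tilde m_i(s) = \tilde m_i(s \hat l + v)$, $i = 1, 3$, and $r_j^\infty(s) = r_j^\infty(s \hat l + v)$, $j \in S^c$.

By (8.61) and Lemma 9.4,

$$|\psi(s_1) - \psi(s_2)| \geq \big( |l| - C \varepsilon^3 \gamma^{-1} |j^3 - k^3| - 2 C_2 |j - k| - 2 C \varepsilon^{3-2a} \gamma^{-1} \big) |s_1 - s_2|$$
$$\geq |j^3 - k^3| \Big( C_1 - C \varepsilon^3 \gamma^{-1} - \frac{2 C_2 |j - k|}{|j^3 - k^3|} - \frac{2 C \varepsilon^{3-2a} \gamma^{-1}}{|j^3 - k^3|} \Big) |s_1 - s_2|$$
$$\geq \frac{C_1}{2} |j^3 - k^3| |s_1 - s_2|$$

for $\varepsilon$ small enough and $j^2 + k^2 + jk \geq C_0 := 12 C_2 / C_1$. As a consequence, the set
$\Delta_{ljk}(i_n) := \{ s : s \hat l + v \in R_{ljk}(i_n) \}$ has Lebesgue measure

$$|\Delta_{ljk}(i_n)| \leq \frac{2}{C_1 |j^3 - k^3|} \cdot \frac{4 \gamma_n |j^3 - k^3|}{\langle l \rangle^\tau} \leq \frac{C \gamma}{\langle l \rangle^\tau}$$

for some $C > 0$. The lemma follows by Fubini's Theorem. □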
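The measure bound in this proof is an instance of a one-dimensional fact: if $|\psi(s_1) - \psi(s_2)| \geq c |s_1 - s_2|$, then the sublevel set $\{ s : |\psi(s)| < \delta \}$ has measure at most $2\delta / c$. With $c = \frac{C_1}{2} |j^3 - k^3|$ and $\delta = 2 \gamma_n |j^3 - k^3| \langle l \rangle^{-\tau}$ this gives $8 \gamma_n / (C_1 \langle l \rangle^\tau)$. The Python sketch below illustrates the fact numerically on a real-valued toy function; the function `psi_example`, its lower Lipschitz constant 2.5 and the grid size are assumptions made only for this illustration.

```python
import math


def sublevel_measure(psi, delta: float, s_min: float, s_max: float,
                     n: int = 200_000) -> float:
    """Midpoint-grid estimate of the measure of {s in [s_min, s_max] : |psi(s)| < delta}."""
    h = (s_max - s_min) / n
    hits = sum(1 for i in range(n) if abs(psi(s_min + (i + 0.5) * h)) < delta)
    return hits * h


def psi_example(s: float) -> float:
    """Toy function with psi' >= 3 - 0.5 = 2.5, so |psi(s1) - psi(s2)| >= 2.5 |s1 - s2|."""
    return 3.0 * s + 0.5 * math.sin(s)


if __name__ == "__main__":
    delta, c = 0.1, 2.5
    print(sublevel_measure(psi_example, delta, -1.0, 1.0), 2 * delta / c)
```

The numerical estimate stays below the bound $2\delta/c$; the bound depends only on the lower Lipschitz constant and not on the precise form of $\psi$.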

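Remark 9.6. When $K = H + \lambda M^2$, $\lambda = 3\varsigma/4$, using (8.40), the conclusion of Lemma
9.5 holds without restrictions on $j, k$. □

It remains to estimate the measure of the finitely many resonant sets $R_{ljk}(i_n)$
for $j^2 + k^2 \leq C_0$. Recalling (8.36) and the parity $\xi_{-j} = \xi_j$, we write $c(\xi) = 6\varsigma \vec{1} \cdot \xi$
where $\vec{1}$ is the vector $(1, \ldots, 1) \in \mathbb{R}^\nu$ and $\xi = (\xi_j)_{j \in S^+} \in \mathbb{R}^\nu$. Hence, by (5.4),

$$\varepsilon^2 c(\xi) = 6\varsigma \vec{1} \cdot A^{-1} [\omega - \bar\omega] = 6\varsigma A^{-T} \vec{1} \cdot [\omega - \bar\omega] \tag{9.14}$$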
where $A^{-T}$ is the transpose of $A^{-1}$. We write the function $\phi(\omega)$ in (9.13) as

$$\phi(\omega) = a_{jk} + b_{ljk} \cdot \omega + q_{jk}(\omega),$$

where

$$a_{jk} := -\mathrm{i} \big( j^3 - k^3 + 6\varsigma (j - k) \vec{1} \cdot A^{-1} \bar\omega \big),$$
$$b_{ljk} := \mathrm{i} \big( l + 6\varsigma (j - k) A^{-T} \vec{1} \big),$$
$$q_{jk}(\omega) := -\mathrm{i} (\tilde m_3 - 1)(j^3 - k^3) + \mathrm{i} (\tilde m_1 - \varepsilon^2 c(\xi))(j - k) + r_j^\infty - r_k^\infty$$

(and $\tilde m_3$, $\tilde m_1$, $\xi$, $r_j^\infty$, $r_k^\infty$ all depend on $\omega$). By (8.62) and since $j^2 + k^2 \leq C_0$ we deduce
that $|q_{jk}|^{\mathrm{Lip}(\gamma)} \leq C \varepsilon^{3-2a}$. Recalling (2.2) we get

$$|q_{jk}|^{\sup} \leq C \varepsilon^{3-2a}, \qquad |q_{jk}|^{\mathrm{lip}} \leq \gamma^{-1} |q_{jk}|^{\mathrm{Lip}(\gamma)} \leq C \varepsilon^{1-3a} \tag{9.15}$$

so that $\phi(\omega)$ is a small perturbation of the affine function $\omega \mapsto a_{jk} + b_{ljk} \cdot \omega$. By
the next lemma, the hypothesis (1.12) on the tangential sites $S$ allows us to verify that
such a function does not vanish identically.

Lemma 9.7. Assume (1.12). Then, for all $j \neq k$, $j^2 + k^2 \leq C_0$, it results $a_{jk} \neq 0$.

Proof. Using formulae (1.19) and (4.11), we calculate

$$\vec{1} \cdot A^{-1} \bar\omega = - \frac{1}{3\varsigma (2\nu - 1)} \sum_{i=1}^{\nu} \bar\jmath_i^{\,2} .$$

Hence

$$a_{jk} = -\mathrm{i} (j - k) \Big( j^2 + jk + k^2 - \frac{2}{2\nu - 1} \sum_{i=1}^{\nu} \bar\jmath_i^{\,2} \Big) \neq 0$$

by assumption (1.12) on the set $S$. □

Lemma 9.7 implies that $\delta := \min \{ |a_{jk}| : j^2 + k^2 \leq C_0, \ j \neq k \} > 0$.

Lemma 9.8. Assume (1.12). If $j^2 + k^2 \leq C_0$, then $|R_{ljk}(i_n)| \leq C \varepsilon^{2(\nu-1)} \gamma \langle l \rangle^{-\tau}$.

Proof. Denote $b := b_{ljk}$ for brevity. For $j^2 + k^2 \leq C_0$, $\omega \in R_{ljk}(i_n)$, one has, by
(9.11), (9.15),

$$|b \cdot \omega| \geq |a_{jk}| - |\phi(\omega)| - |q_{jk}(\omega)| \geq \delta - 2 \gamma_n |j^3 - k^3| \langle l \rangle^{-\tau} - C \varepsilon^{3-2a} \geq \delta/2$$

for $\varepsilon$ small enough. On the other hand, $|b \cdot \omega| \leq 2 |\bar\omega| |b|$ because $|\omega| \leq 2 |\bar\omega|$ (see
(5.2) and (4.10)). Hence $|b| \geq \delta_1$ where $\delta_1 := \delta / (4 |\bar\omega|) > 0$. Split $\omega = s \hat b + v$ where
$\hat b := b / |b|$ and $v \cdot b = 0$. Let $\psi(s) := \phi(s \hat b + v)$. By (9.15), for $\varepsilon$ small enough, we get

$$|\psi(s_1) - \psi(s_2)| \geq \big( |b| - |q_{jk}|^{\mathrm{lip}} \big) |s_1 - s_2| \geq \frac{\delta_1}{2} |s_1 - s_2| .$$

Then we proceed similarly as in the proof of Lemma 9.5. □

The proof of (9.9) follows from Lemmata 9.3, ..., 9.8, proceeding as in [3]
(see the conclusion of the proof of Theorem 5.1 in [3]). □
Proof of Theorem 5.1 concluded. The conclusion of the proof of Theorem 5.1
follows exactly as in [5] (see "Proof of Theorem 5.1 concluded" in [5]).

Remark 9.9. By Remark 9.6, Lemma 9.7 (which is the only point in the paper
where assumption (1.12) is used) is not needed any more. Thus Theorem 1.1 applies
to $K = H + (3\varsigma/4) M^2$ without assuming hypothesis (1.12). □